[Linux-kernel-mentees] Fix for BAD_SIGN_OFF: non-standard signature

Lukas Bulwahn lukas.bulwahn at gmail.com
Wed Nov 18 19:17:06 UTC 2020


On Mi., 18. Nov. 2020 at 11:12, Aditya <yashsri421 at gmail.com> wrote:

> On 18/11/20 2:24 am, Aditya wrote:
> > On 17/11/20 11:12 pm, Lukas Bulwahn wrote:
> >> On Tue, Nov 17, 2020 at 7:03 PM Aditya <yashsri421 at gmail.com> wrote:
> >>>
> >>> On 13/11/20 11:55 pm, Aditya wrote:
> >>>> On 13/11/20 8:56 pm, Lukas Bulwahn wrote:
> >>>>> On Fri, Nov 13, 2020 at 4:00 PM Aditya <yashsri421 at gmail.com> wrote:
> >>>>>>
> >>>>>> On 13/11/20 8:05 pm, Aditya wrote:
> >>>>>>> On 12/11/20 1:34 am, Lukas Bulwahn wrote:
> >>>>>>>> On Wed, Nov 11, 2020 at 3:13 PM Aditya <yashsri421 at gmail.com>
> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Sir
> >>>>>>>>> I have analyzed the checkpatch report for BAD_SIGN_OFF(over
> >>>>>>>>> v4.13..v5.8) for non-standard signature and generated reports
> for it.
> >>>>>>>>> Some mistakes are more frequent than others, whereas some
> mistakes
> >>>>>>>>> even have a frequency of 1.
> >>>>>>>>>
> >>>>>>>>> Non-standard signatures occurring with their frequency:
> >>>>>>>>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/non_standard_signs.txt
> >>>>>>>>>
> >>>>>>>>> Complete warning messages:
> >>>>>>>>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/warn_msgs.txt
> >>>>>>>>>
> >>>>>>>>> Should I implement the fix similar to TYPO_FIX, where we have a
> >>>>>>>>> separate file for common misspellings and corrected words? Or
> should I
> >>>>>>>>> make a hash of these misspellings in checkpatch.pl file as well?
> >>>>>>>>>
> >>>>>>>>> Also should I include all these misspelled words in it? Or omit
> words
> >>>>>>>>> below certain frequency?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I think the best way would be to compute some kind of edit
> distance to
> >>>>>>>> the known signature tags and if this edit distance is below a
> certain
> >>>>>>>> threshold, suggest that signature tag as the fix. We can then
> evaluate
> >>>>>>>> to determine the best suitable threshold. The edit distance
> between
> >>>>>>>> the different tags are so large that this should always work as
> >>>>>>>> intended.
> >>>>>>>>
> >>>>>>>> Then, we can look into these other creative tags and propose
> suitable
> >>>>>>>> existing tags for the more frequent ones that are non-standard.
> Or in
> >>>>>>>> the case, none of the existing ones fit we can start the
> discussion on
> >>>>>>>> proposing some new standard ones.
> >>>>>>>>
> >>>>>>>
> >>>>>>> I have generated a list of non-standard signatures and their fixes
> on
> >>>>>>> the basis of edit distance.
> >>>>>>>
> >>>>>>> This is the common list of non standard signatures and fixes (in
> >>>>>>> detail):
> >>>>>>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/min_dists.txt
> >>>>>>>
> >>>>>>> As I observed, I think, we can consider '<=2' as the threshold edit
> >>>>>>> distance.
> >>>>>>> List for non-standard signature and their proposed fix with edit
> >>>>>>> distance<=2 :
> >>>>>>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than_3.txt
> >>>>>>>
> >>>>>>> I have also generated lists for 3 and 4 edit distance separately
> for
> >>>>>>> reference:
> >>>>>>> Equal to 3:
> >>>>>>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/equal_3.txt
> >>>>>>>
> >>>>>>> Equal to 4:
> >>>>>>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/equal_4.txt
> >>>>>>>
> >>>>>>> For the rest I guess we'll need to hard code eg. for 'Debugged-by',
> >>>>>>> 'Requested-by' etc.
> >>>>>>>
> >>>>>>> These are the complete lists of non-standard signatures:
> >>>>>>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/non_standard_signs.txt
> >>>>>>>
> >>>>>
> >>>>> Can you share which non-standard-signatures would be
> >>>>> handled/transformed with edit distance 2 and which would not in a
> >>>>> similar format to non_standard_signs.txt (so, ordered by frequency).
> >>>>>
> >>>>> We can then consider those that remain and find a good next strategy
> >>>>> for the most frequent non-standard signatures.
> >>>>>
> >>>>
> >>>> Non standard signatures handled with edit distance 2:
> >>>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than2/signs_freq.txt
> >>>>
> >>>> Non standard signatures with edit distance greater than 2:
> >>>>
> https://github.com/AdityaSrivast/kernel-tasks/tree/master/random/non_standard_signature/more_than2
> >>>>
> >>>
> >>> I think this mail probably got missed. I'll summarize it a bit for
> >>> simplicity:
> >>> With edit distance approach and threshold as 2, we're able to handle
> >>> 39 out of 109 'distinct' cases of non-standard signature. In this 39,
> >>> the maximum count of non-standard signature is 19 for 'Reviwed-by:'; 9
> >>> for 'Reviewd-by:' and other common mispellings.
> >>> Complete List:
> >>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than2/signs_freq.txt
> >>>
> >>> However, still we are unable to account for 70 non-standard signatures
> >>> which occur more frequently (eg 'Debugged-by:', which has occurred 61
> >>> times; 'Requested-by:', 48 times; and so on).
> >>> Complete list:
> >>>
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/more_than2/signs_freq.txt
> >>>
> >>> I think for these cases we'd need to make some file (as is used for
> >>> TYPO_SPELLING), or hash.
> >>> What do you think/suggest?
> >>>
> >>
> >> Yes, I agree.
> >>
> >> Goal 1: Try to map all the non-default signatures to their "standard"
> >> counterpart as much as possible.
> >>
> >> Goal 2: Introduce a few very little signatures to handle those cases
> >> that really cannot be mapped to a non-default signature.
> >>
> >> Provide good rationales that you can defend and provide documentation
> >> for when checkpatch shall explain the fix it proposes.
> >>
> >> Here an example for the first ten cases:
> >>
> >> 1)Debugged-by: 61 -> Codeveloped-by:
> >>
> >> Rationale: Debugging is part of Software Development; so
> >> Codeveloped-by is perfectly fine, even if the contributor did not
> >> create code.
> >>
> >> (alternatively: maybe a new Assisted-by would do here.)
> >>
> >> 2)Requested-by: 48 -> Suggested-by:
> >>
> >> Rationale: In an open-source project, there are "no requests", just
> >> "suggestions" to convince a maintainer to accept your patch.
> >>
> >> 3)Co-authored-by: 43 -> Codeveloped-by:
> >>
> >> Rationale: clear. Codeveloped-by and Co-authored-by are synonyms.
> >>
> >> 4)Originally-by: 39
> >>
> >> Maybe something like this deserves to be a new tag. There is a
> >> significant difference to codeveloped-by. But that needs discussion.
> >>
> >> 5)Analyzed-by: 22
> >>
> >> Rationale: Analyzing is part of Software Development; so
> >> Codeveloped-by is perfectly fine, even if the contributor did not
> >> create code.
> >> (alternatively: maybe a new Assisted-by would do here.)
> >>
> >> 6)Bisected-by: 20
> >>
> >> Difficult...
> >> (maybe a new Assisted-by would do here.)
> >>
> >> 7)Improvements-by: 19 -> Codeveloped-by:
> >>
> >> 8)Generated-by: 17 -> Reported-by: ?
> >>
> >> What does generated-by actually mean?
> >>
> >> 9)Noticed-by: 11 -> Reported-by:
> >>
> >> 10)Inspired-by: 11 -> Suggested-by:
> >>
> >> Maybe you can come up with a list for the next twenty and then we
> >> discuss them with Joe Perches and then a larger group?
> >>
>
> This is the list for next 20:
>
> 11)Original-patch-by: 11 -> co-developed-by / Originally-by (a new
> signoff)
> Rationale: I checked mailing list for one of these signoffs.
> Link1:
>
> https://lore.kernel.org/linux-perf-users/20190221122306.1511-1-jonas.rabenstein@studium.uni-erlangen.de/
> Link2
> <https://lore.kernel.org/linux-perf-users/20190221122306.1511-1-jonas.rabenstein@studium.uni-erlangen.de/Link2>
> :
>
> https://lore.kernel.org/linux-perf-users/20190307174433.28819-32-acme@kernel.org/
>
> Here it seems like someone who started working on the patch but
> couldn't complete it, but still has
> significant contribution in the patch.
> Maybe signing off as codeveloper suffices the purpose. I'm not sure though
>
> 12)Diagnosed-by: 11 -> Maybe 'Reviewed-by' or 'Acked-by'
> Rationale: Observed a few mailing lists, eg here:
> https://lore.kernel.org/lkml/20190609164128.000227333@linuxfoundation.org/
> But could not decide as the user is not adding it along the mails, but
> seems like a maintainer.
>
> 13)Based-on-a-patch-by: 8 -> Similar to 'Originally-by'
>
> 14)Verified-by: 8 -> Tested-by
> Rationale: Used by a single user. On reading, mailing list, it seems
> that 'Tested-by' tag might be a suitable alternative.
> Link:
>
> https://lore.kernel.org/lkml/CA+jURcugFhSt9GGRZELQUCnupOf2Ns96Ao5ZruWfVtq=z_7ytw@mail.gmail.com/
>
> 15)Okay-ished-by: 8 -> Acked-by
> Rationale: Used by a single user. On reading, mailing list, it seems
> that 'Acked-by' tag might be a suitable alternative.
> Link:
>
> https://lore.kernel.org/lkml/f06e74e9a38b83ec273196bce727295b828c5870.1507769413.git.rgb@redhat.com/
>
> 16)Based-on-patch-by: 7 -> Similar to (13) Based-on-a-patch-by
>
> 17)Root-caused-by: 6 -> Maybe 'Fixes:' followed by the commit it is
> fixing.
> Rationale: Going through mailing list, it comes up added with the
> patch. So I couldn't be sure
>
> 18)Original-by: 6 -> Similar to '(4)Originally-by'
>
> 19)Acked-for-MFD-by: 6 -> Acked-by:
>
> 20)Reviewed-off-by: 5 -> Reviewed-by:
>
> 21)Based-on-patches-by: 5 -> Similar to (13)
>
> 22)Analysed-by: 5 -> Co-developed-by/Reviewed-by
> Rationale: Similar to '(5)Analyzed-by'
>
> 23)Based-on-work-by: 5 -> Not sure. Maybe 'Suggested-by'
>
> 24)Proposed-by: 5 -> Maybe 'Suggested-by'
> Rationale: The tag comes up added with the patch,and the user is also
> given the tag 'Signed-off-by', but does not seem to participate in the
> conversation.
> Maybe he is a maintainer, who suggested the patch.
> mailing list:
> https://lore.kernel.org/linux-nvme/20200501212545.21856-3-sagi@grimberg.me/
>
> 25)Reported-and-bisected-by: 4 -> Two different tags: 'Reported-by:'
> and 'Bisected-by'
>
> 26)Fixed-by: 3 -> Co-developed-by
> Rationale: I observed one of these commit conservations here:
> https://lore.kernel.org/lkml/1b45ffd1-99bb-4ac1-fb65-0de3e42c1c0a@amd.com/
> It seems like there was some bug with this patch, which was fixed by
> the user. I guess Co-developed-by should go well as alternative.
>
> 27)Pointed-out-by: 3 -> Suggested-by
> Rationale: For commit 87bd4c26a6c8 ("clocksource/drivers/tegra: Lower
> clocksource rating for some Tegra's"), this warning occurs, where
> the patch is also 'Acked-by' Peter De Schrijver. So, it seems like he
> is a maintainer who must have suggested these changes
>
> 28)Suggestions-by: 3 -> Suggested-by
>
> 29)Celebrated-by: 3 -> Might be suggested to remove
> Rationale: This tag is used for a single commit 3 times, seems like a
> tag used for celebration of a particular patch
> Link:
>
> https://lore.kernel.org/lkml/CANRm+CyonYOzGdXo+D8gr8n04=f=S92QH-HxETKnoGGxhMFREA@mail.gmail.com/
>
> 30)Pointed-at-by: 2 -> Suggested-by
> Rationale: One of these tags is named for Greg Kroah-Hartman
> <gregkh at linuxfoundation.org>, who is probably a maintainer.
> Here, the user might just want to acknowledge him for his suggestion,
> so 'Suggested-by' seems appropriate.
>
> What do you think?
>

I will provide some detailed comments in a few days, okay?

I suggest you start creating the patch that fixes tags with the low edit
distance.

Once that patch is good and accepted, we continue with the work on this
list.

Lukas


> Thanks
> Aditya
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linuxfoundation.org/pipermail/linux-kernel-mentees/attachments/20201118/067d04e6/attachment-0001.html>


More information about the Linux-kernel-mentees mailing list