<div><br></div><div><br><div class="gmail_quote"><div dir="auto">On Wed, 18 Nov 2020 at 11:12, Aditya <<a href="mailto:yashsri421@gmail.com">yashsri421@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 18/11/20 2:24 am, Aditya wrote:<br>
> On 17/11/20 11:12 pm, Lukas Bulwahn wrote:<br>
>> On Tue, Nov 17, 2020 at 7:03 PM Aditya <<a href="mailto:yashsri421@gmail.com" target="_blank">yashsri421@gmail.com</a>> wrote:<br>
>>><br>
>>> On 13/11/20 11:55 pm, Aditya wrote:<br>
>>>> On 13/11/20 8:56 pm, Lukas Bulwahn wrote:<br>
>>>>> On Fri, Nov 13, 2020 at 4:00 PM Aditya <<a href="mailto:yashsri421@gmail.com" target="_blank">yashsri421@gmail.com</a>> wrote:<br>
>>>>>><br>
>>>>>> On 13/11/20 8:05 pm, Aditya wrote:<br>
>>>>>>> On 12/11/20 1:34 am, Lukas Bulwahn wrote:<br>
>>>>>>>> On Wed, Nov 11, 2020 at 3:13 PM Aditya <<a href="mailto:yashsri421@gmail.com" target="_blank">yashsri421@gmail.com</a>> wrote:<br>
>>>>>>>>><br>
>>>>>>>>> Hi Sir<br>
>>>>>>>>> I have analyzed the checkpatch report for BAD_SIGN_OFF(over<br>
>>>>>>>>> v4.13..v5.8) for non-standard signature and generated reports for it.<br>
>>>>>>>>> Some mistakes are more frequent than others, whereas some mistakes<br>
>>>>>>>>> even have a frequency of 1.<br>
>>>>>>>>><br>
>>>>>>>>> Non-standard signatures occurring with their frequency:<br>
>>>>>>>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/non_standard_signs.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/non_standard_signs.txt</a><br>
>>>>>>>>><br>
>>>>>>>>> Complete warning messages:<br>
>>>>>>>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/warn_msgs.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/warn_msgs.txt</a><br>
>>>>>>>>><br>
>>>>>>>>> Should I implement the fix similar to TYPO_FIX, where we have a<br>
>>>>>>>>> separate file for common misspellings and corrected words? Or should I<br>
>>>>>>>>> make a hash of these misspellings in <a href="http://checkpatch.pl" rel="noreferrer" target="_blank">checkpatch.pl</a> file as well?<br>
>>>>>>>>><br>
>>>>>>>>> Also should I include all these misspelled words in it? Or omit words<br>
>>>>>>>>> below certain frequency?<br>
>>>>>>>>><br>
>>>>>>>><br>
>>>>>>>> I think the best way would be to compute some kind of edit distance to<br>
>>>>>>>> the known signature tags and if this edit distance is below a certain<br>
>>>>>>>> threshold, suggest that signature tag as the fix. We can then evaluate<br>
>>>>>>>> to determine the best suitable threshold. The edit distance between<br>
>>>>>>>> the different tags are so large that this should always work as<br>
>>>>>>>> intended.<br>
>>>>>>>><br>
>>>>>>>> Then, we can look into these other creative tags and propose suitable<br>
>>>>>>>> existing tags for the more frequent ones that are non-standard. Or in<br>
>>>>>>>> the case, none of the existing ones fit we can start the discussion on<br>
>>>>>>>> proposing some new standard ones.<br>
>>>>>>>><br>
>>>>>>><br>
>>>>>>> I have generated a list of non-standard signatures and their fixes on<br>
>>>>>>> the basis of edit distance.<br>
>>>>>>><br>
>>>>>>> This is the common list of non standard signatures and fixes (in<br>
>>>>>>> detail):<br>
>>>>>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/min_dists.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/min_dists.txt</a><br>
>>>>>>><br>
>>>>>>> As I observed, I think, we can consider '<=2' as the threshold edit<br>
>>>>>>> distance.<br>
>>>>>>> List for non-standard signature and their proposed fix with edit<br>
>>>>>>> distance<=2 :<br>
>>>>>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than_3.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than_3.txt</a><br>
>>>>>>><br>
>>>>>>> I have also generated lists for 3 and 4 edit distance separately for<br>
>>>>>>> reference:<br>
>>>>>>> Equal to 3:<br>
>>>>>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/equal_3.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/equal_3.txt</a><br>
>>>>>>><br>
>>>>>>> Equal to 4:<br>
>>>>>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/equal_4.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/equal_4.txt</a><br>
>>>>>>><br>
>>>>>>> For the rest I guess we'll need to hard code eg. for 'Debugged-by',<br>
>>>>>>> 'Requested-by' etc.<br>
>>>>>>><br>
>>>>>>> These are the complete lists of non-standard signatures:<br>
>>>>>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/non_standard_signs.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/non_standard_signs.txt</a><br>
>>>>>>><br>
>>>>><br>
>>>>> Can you share which non-standard-signatures would be<br>
>>>>> handled/transformed with edit distance 2 and which would not in a<br>
>>>>> similar format to non_standard_signs.txt (so, ordered by frequency).<br>
>>>>><br>
>>>>> We can then consider those that remain and find a good next strategy<br>
>>>>> for the most frequent non-standard signatures.<br>
>>>>><br>
>>>><br>
>>>> Non standard signatures handled with edit distance 2:<br>
>>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than2/signs_freq.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than2/signs_freq.txt</a><br>
>>>><br>
>>>> Non standard signatures with edit distance greater than 2:<br>
>>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/tree/master/random/non_standard_signature/more_than2" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/tree/master/random/non_standard_signature/more_than2</a><br>
>>>><br>
>>><br>
>>> I think this mail probably got missed. I'll summarize it a bit for<br>
>>> simplicity:<br>
>>> With the edit-distance approach and a threshold of 2, we're able to handle<br>
>>> 39 out of 109 'distinct' non-standard signatures. Among these 39,<br>
>>> the most frequent is 'Reviwed-by:' with 19 occurrences, followed by<br>
>>> 'Reviewd-by:' with 9 and other common misspellings.<br>
>>> Complete List:<br>
>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than2/signs_freq.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than2/signs_freq.txt</a><br>
>>><br>
>>> However, we are still unable to account for 70 non-standard signatures,<br>
>>> which occur more frequently (e.g. 'Debugged-by:', which has occurred 61<br>
>>> times; 'Requested-by:', 48 times; and so on).<br>
>>> Complete list:<br>
>>> <a href="https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/more_than2/signs_freq.txt" rel="noreferrer" target="_blank">https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/more_than2/signs_freq.txt</a><br>
>>><br>
>>> I think for these cases we'd need to create a separate file (as is used for<br>
>>> TYPO_SPELLING) or a hash.<br>
>>> What do you think/suggest?<br>
>>><br>
>><br>
>> Yes, I agree.<br>
>><br>
>> Goal 1: Try to map all the non-default signatures to their "standard"<br>
>> counterpart as much as possible.<br>
>><br>
>> Goal 2: Introduce a very small number of new signatures to handle those cases<br>
>> that really cannot be mapped to a standard signature.<br>
>><br>
>> Provide good rationales that you can defend and provide documentation<br>
>> for when checkpatch shall explain the fix it proposes.<br>
>><br>
>> Here is an example for the first ten cases:<br>
>><br>
>> 1)Debugged-by: 61 -> Co-developed-by:<br>
>><br>
>> Rationale: Debugging is part of software development, so<br>
>> Co-developed-by is perfectly fine, even if the contributor did not<br>
>> create code.<br>
>><br>
>> (alternatively: maybe a new Assisted-by would do here.)<br>
>><br>
>> 2)Requested-by: 48 -> Suggested-by:<br>
>><br>
>> Rationale: In an open-source project, there are "no requests", just<br>
>> "suggestions" to convince a maintainer to accept your patch.<br>
>><br>
>> 3)Co-authored-by: 43 -> Co-developed-by:<br>
>><br>
>> Rationale: clear. Co-developed-by and Co-authored-by are synonyms.<br>
>><br>
>> 4)Originally-by: 39<br>
>><br>
>> Maybe something like this deserves to be a new tag. There is a<br>
>> significant difference from Co-developed-by. But that needs discussion.<br>
>><br>
>> 5)Analyzed-by: 22<br>
>><br>
>> Rationale: Analyzing is part of software development, so<br>
>> Co-developed-by is perfectly fine, even if the contributor did not<br>
>> create code.<br>
>> (alternatively: maybe a new Assisted-by would do here.)<br>
>><br>
>> 6)Bisected-by: 20<br>
>><br>
>> Difficult...<br>
>> (maybe a new Assisted-by would do here.)<br>
>><br>
>> 7)Improvements-by: 19 -> Co-developed-by:<br>
>><br>
>> 8)Generated-by: 17 -> Reported-by: ?<br>
>><br>
>> What does generated-by actually mean?<br>
>><br>
>> 9)Noticed-by: 11 -> Reported-by:<br>
>><br>
>> 10)Inspired-by: 11 -> Suggested-by:<br>
>><br>
>> Maybe you can come up with a list for the next twenty and then we<br>
>> discuss them with Joe Perches and then a larger group?<br>
>><br>
<br>
This is the list for the next 20:<br>
<br>
11)Original-patch-by: 11 -> Co-developed-by / Originally-by (a new<br>
tag)<br>
Rationale: I checked the mailing list for one of these tags.<br>
Link1:<br>
<a href="https://lore.kernel.org/linux-perf-users/20190221122306.1511-1-jonas.rabenstein@studium.uni-erlangen.de/" rel="noreferrer" target="_blank">https://lore.kernel.org/linux-perf-users/20190221122306.1511-1-jonas.rabenstein@studium.uni-erlangen.de/</a><br>
Link2:<br>
<a href="https://lore.kernel.org/linux-perf-users/20190307174433.28819-32-acme@kernel.org/" rel="noreferrer" target="_blank">https://lore.kernel.org/linux-perf-users/20190307174433.28819-32-acme@kernel.org/</a><br>
<br>
Here it seems that someone started working on the patch but<br>
couldn't complete it, yet still made a<br>
significant contribution to it.<br>
Maybe crediting them as a co-developer serves the purpose. I'm not sure, though.<br>
<br>
12)Diagnosed-by: 11 -> Maybe 'Reviewed-by' or 'Acked-by'<br>
Rationale: I looked at a few mailing list threads, e.g. here:<br>
<a href="https://lore.kernel.org/lkml/20190609164128.000227333@linuxfoundation.org/" rel="noreferrer" target="_blank">https://lore.kernel.org/lkml/20190609164128.000227333@linuxfoundation.org/</a><br>
But I could not decide, as the tag is not added in the mails by the user,<br>
who seems to be a maintainer.<br>
<br>
13)Based-on-a-patch-by: 8 -> Similar to 'Originally-by'<br>
<br>
14)Verified-by: 8 -> Tested-by<br>
Rationale: Used by a single user. On reading the mailing list thread, it seems<br>
that the 'Tested-by' tag might be a suitable alternative.<br>
Link:<br>
<a href="https://lore.kernel.org/lkml/CA+jURcugFhSt9GGRZELQUCnupOf2Ns96Ao5ZruWfVtq=z_7ytw@mail.gmail.com/" rel="noreferrer" target="_blank">https://lore.kernel.org/lkml/CA+jURcugFhSt9GGRZELQUCnupOf2Ns96Ao5ZruWfVtq=z_7ytw@mail.gmail.com/</a><br>
<br>
15)Okay-ished-by: 8 -> Acked-by<br>
Rationale: Used by a single user. On reading the mailing list thread, it seems<br>
that the 'Acked-by' tag might be a suitable alternative.<br>
Link:<br>
<a href="https://lore.kernel.org/lkml/f06e74e9a38b83ec273196bce727295b828c5870.1507769413.git.rgb@redhat.com/" rel="noreferrer" target="_blank">https://lore.kernel.org/lkml/f06e74e9a38b83ec273196bce727295b828c5870.1507769413.git.rgb@redhat.com/</a><br>
<br>
16)Based-on-patch-by: 7 -> Similar to (13) Based-on-a-patch-by<br>
<br>
17)Root-caused-by: 6 -> Maybe 'Fixes:' followed by the commit it is<br>
fixing.<br>
Rationale: Going through the mailing list, the tag is already present when<br>
the patch is posted, so I couldn't be sure.<br>
<br>
18)Original-by: 6 -> Similar to '(4)Originally-by'<br>
<br>
19)Acked-for-MFD-by: 6 -> Acked-by:<br>
<br>
20)Reviewed-off-by: 5 -> Reviewed-by:<br>
<br>
21)Based-on-patches-by: 5 -> Similar to (13)<br>
<br>
22)Analysed-by: 5 -> Co-developed-by/Reviewed-by<br>
Rationale: Similar to '(5)Analyzed-by'<br>
<br>
23)Based-on-work-by: 5 -> Not sure. Maybe 'Suggested-by'<br>
<br>
24)Proposed-by: 5 -> Maybe 'Suggested-by'<br>
Rationale: The tag is already present when the patch is posted, and the user<br>
is also given a 'Signed-off-by' tag, but does not seem to participate in the<br>
conversation.<br>
Maybe he is a maintainer who suggested the patch.<br>
Mailing list:<br>
<a href="https://lore.kernel.org/linux-nvme/20200501212545.21856-3-sagi@grimberg.me/" rel="noreferrer" target="_blank">https://lore.kernel.org/linux-nvme/20200501212545.21856-3-sagi@grimberg.me/</a><br>
<br>
25)Reported-and-bisected-by: 4 -> Two different tags: 'Reported-by:'<br>
and 'Bisected-by'<br>
<br>
26)Fixed-by: 3 -> Co-developed-by<br>
Rationale: I looked at the conversation for one of these commits here:<br>
<a href="https://lore.kernel.org/lkml/1b45ffd1-99bb-4ac1-fb65-0de3e42c1c0a@amd.com/" rel="noreferrer" target="_blank">https://lore.kernel.org/lkml/1b45ffd1-99bb-4ac1-fb65-0de3e42c1c0a@amd.com/</a><br>
It seems there was a bug in this patch, which was fixed by<br>
the user. I guess Co-developed-by should work well as an alternative.<br>
<br>
27)Pointed-out-by: 3 -> Suggested-by<br>
Rationale: This warning occurs for commit 87bd4c26a6c8 ("clocksource/drivers/tegra: Lower<br>
clocksource rating for some Tegra's"), where<br>
the patch is also 'Acked-by' Peter De Schrijver. So it seems he<br>
is a maintainer who must have suggested these changes.<br>
<br>
28)Suggestions-by: 3 -> Suggested-by<br>
<br>
29)Celebrated-by: 3 -> Maybe suggest removing it<br>
Rationale: This tag is used 3 times in a single commit; it seems to be a<br>
tag celebrating that particular patch.<br>
Link:<br>
<a href="https://lore.kernel.org/lkml/CANRm+CyonYOzGdXo+D8gr8n04=f=S92QH-HxETKnoGGxhMFREA@mail.gmail.com/" rel="noreferrer" target="_blank">https://lore.kernel.org/lkml/CANRm+CyonYOzGdXo+D8gr8n04=f=S92QH-HxETKnoGGxhMFREA@mail.gmail.com/</a><br>
<br>
30)Pointed-at-by: 2 -> Suggested-by<br>
Rationale: One of these tags credits Greg Kroah-Hartman<br>
<<a href="mailto:gregkh@linuxfoundation.org" target="_blank">gregkh@linuxfoundation.org</a>>, who is probably a maintainer.<br>
Here, the user might just want to acknowledge him for his suggestion,<br>
so 'Suggested-by' seems appropriate.<br>
<br>
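For the tags that are too far from any standard tag for edit distance to help, the lookup-table idea mentioned earlier in the thread could look roughly like the sketch below.<br>
It is written in Python only for illustration (checkpatch itself is Perl); the names PROPOSED_FIXES and propose_fix are hypothetical, and the entries are just the proposals from this thread that seemed least contested, not an agreed list.<br>

```python
# Hypothetical table: non-standard tag -> replacement proposed in this thread.
# Tags still under discussion (e.g. Originally-by, Bisected-by) are left out.
PROPOSED_FIXES = {
    "Requested-by": "Suggested-by",
    "Co-authored-by": "Co-developed-by",
    "Noticed-by": "Reported-by",
    "Inspired-by": "Suggested-by",
    "Verified-by": "Tested-by",
    "Okay-ished-by": "Acked-by",
    "Acked-for-MFD-by": "Acked-by",
    "Reviewed-off-by": "Reviewed-by",
    "Pointed-out-by": "Suggested-by",
    "Suggestions-by": "Suggested-by",
    "Pointed-at-by": "Suggested-by",
}

# Case-insensitive view of the table, built once.
_LOOKUP = {tag.lower(): fix for tag, fix in PROPOSED_FIXES.items()}


def propose_fix(tag):
    """Return the proposed standard tag, or None if the table has no entry."""
    key = tag.strip().rstrip(":").lower()
    return _LOOKUP.get(key)
```

A tag with no entry (for example 'Celebrated-by:') simply yields None, so the caller can fall back to a generic warning.<br>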
What do you think?<br>
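<br>
For reference, the edit-distance approach with threshold 2 discussed earlier in the thread can be sketched as follows.<br>
This is an illustrative Python sketch, not the checkpatch implementation: STANDARD_TAGS is an assumed subset of the standard tags, and edit_distance and suggest_tag are hypothetical helper names.<br>

```python
# Assumed subset of the standard signature tags (not taken from checkpatch).
STANDARD_TAGS = [
    "Signed-off-by", "Co-developed-by", "Acked-by", "Tested-by",
    "Reviewed-by", "Reported-by", "Suggested-by",
]


def edit_distance(a, b):
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def suggest_tag(tag, threshold=2):
    """Return the closest standard tag within the threshold, else None."""
    name = tag.strip().rstrip(":").lower()
    best = min(STANDARD_TAGS, key=lambda t: edit_distance(name, t.lower()))
    if edit_distance(name, best.lower()) > threshold:
        return None
    return best
```

With this sketch, 'Reviwed-by:' and 'Reviewd-by:' both map to 'Reviewed-by', while 'Debugged-by:' stays unmatched and would need the table-based handling.<br>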
</blockquote><div dir="auto"><br></div><div dir="auto">I will provide some detailed comments in a few days, okay?</div><div dir="auto"><br></div><div dir="auto">I suggest you start creating the patch that fixes tags with the low edit distance.</div><div dir="auto"><br></div><div dir="auto">Once that patch is good and accepted, we continue with the work on this list.</div><div dir="auto"><br></div><div dir="auto">Lukas</div><div dir="auto"><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
Thanks<br>
Aditya<br>
</blockquote></div></div>