[Linux-kernel-mentees] Fix for BAD_SIGN_OFF: non-standard signature

Aditya yashsri421 at gmail.com
Fri Nov 13 18:25:45 UTC 2020


On 13/11/20 8:56 pm, Lukas Bulwahn wrote:
> On Fri, Nov 13, 2020 at 4:00 PM Aditya <yashsri421 at gmail.com> wrote:
>>
>> On 13/11/20 8:05 pm, Aditya wrote:
>>> On 12/11/20 1:34 am, Lukas Bulwahn wrote:
>>>> On Wed, Nov 11, 2020 at 3:13 PM Aditya <yashsri421 at gmail.com> wrote:
>>>>>
>>>>> Hi Sir
>>>>> I have analyzed the checkpatch report for BAD_SIGN_OFF (over
>>>>> v4.13..v5.8) for non-standard signatures and generated reports for it.
>>>>> Some mistakes are more frequent than others, whereas some mistakes
>>>>> occur only once.
>>>>>
>>>>> Non-standard signatures with their frequencies:
>>>>> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/non_standard_signs.txt
>>>>>
>>>>> Complete warning messages:
>>>>> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/warn_msgs.txt
>>>>>
>>>>> Should I implement the fix similarly to TYPO_FIX, where we have a
>>>>> separate file for common misspellings and corrected words? Or should I
>>>>> make a hash of these misspellings in the checkpatch.pl file itself?
>>>>>
>>>>> Also, should I include all these misspelled words in it? Or omit words
>>>>> below a certain frequency?
>>>>>
>>>>
>>>> I think the best way would be to compute some kind of edit distance to
>>>> the known signature tags and if this edit distance is below a certain
>>>> threshold, suggest that signature tag as the fix. We can then evaluate
>>>> to determine the most suitable threshold. The edit distances between
>>>> the different tags are so large that this should always work as
>>>> intended.
>>>>
>>>> Then, we can look into these other creative tags and propose suitable
>>>> existing tags for the more frequent ones that are non-standard. Or, in
>>>> case none of the existing ones fit, we can start the discussion on
>>>> proposing some new standard ones.
>>>>
>>>
>>> I have generated a list of non-standard signatures and their fixes on
>>> the basis of edit distance.
>>>
>>> This is the common list of non-standard signatures and their fixes (in
>>> detail):
>>> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/min_dists.txt
>>>
>>> From what I observed, I think we can use '<=2' as the threshold edit
>>> distance.
>>> List of non-standard signatures and their proposed fixes with edit
>>> distance <= 2:
>>> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than_3.txt
>>>
>>> I have also generated separate lists for edit distances 3 and 4, for
>>> reference:
>>> Equal to 3:
>>> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/equal_3.txt
>>>
>>> Equal to 4:
>>> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/equal_4.txt
>>>
>>> For the rest, I guess we'll need to hard-code the fixes, e.g. for
>>> 'Debugged-by', 'Requested-by', etc.
>>>
>>> These are the complete lists of non-standard signatures:
>>> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/non_standard_signs.txt
>>>
> 
> Can you share which non-standard signatures would be
> handled/transformed with edit distance 2 and which would not, in a
> format similar to non_standard_signs.txt (so, ordered by frequency)?
> 
> We can then consider those that remain and find a good next strategy
> for the most frequent non-standard signatures.
> 

Non-standard signatures handled with edit distance <= 2:
https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/non_standard_signature/less_than2/signs_freq.txt

Non-standard signatures with edit distance greater than 2:
https://github.com/AdityaSrivast/kernel-tasks/tree/master/random/non_standard_signature/more_than2
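
For reference, here is a minimal sketch of how such a split could be
computed. It is written in Python purely for illustration (checkpatch.pl
itself is Perl), and it is not the script used to generate the lists
above. The tag list is an illustrative subset of the standard tags, and
the names edit_distance, suggest_tag and split_by_threshold are
hypothetical.

```python
# Illustrative subset of standard signature tags (an assumption for this
# sketch, not necessarily the full list that checkpatch.pl accepts).
STANDARD_TAGS = [
    "Signed-off-by", "Co-developed-by", "Acked-by", "Tested-by",
    "Reviewed-by", "Reported-by", "Suggested-by",
]


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def suggest_tag(tag, threshold=2):
    """Return the closest standard tag, or None if it is beyond the threshold."""
    distances = [(edit_distance(tag.lower(), std.lower()), std)
                 for std in STANDARD_TAGS]
    best_distance, best_tag = min(distances)
    return best_tag if best_distance <= threshold else None


def split_by_threshold(frequencies, threshold=2):
    """Split {tag: count} into (handled, remaining), each ordered by frequency."""
    handled, remaining = [], []
    for tag, count in sorted(frequencies.items(), key=lambda kv: -kv[1]):
        fix = suggest_tag(tag, threshold)
        if fix is not None:
            handled.append((tag, count, fix))
        else:
            remaining.append((tag, count, None))
    return handled, remaining
```

With this sketch, a typo such as 'Signed-of-by' maps to 'Signed-off-by',
while 'Debugged-by' is further than 2 from every tag in the list and so
ends up in the remaining group, to be handled separately.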

Thanks
Aditya

