[Linux-kernel-mentees] checkpatch.pl improvement: NO_AUTHOR_SIGN_OFF warnings for users with multiple emails

Dwaipayan Ray dwaipayanray1 at gmail.com
Mon Sep 21 12:11:36 UTC 2020


Hi Joe and Lukas and others,
I would like to elaborate a bit on the issue, and on the solution I
have in mind, for the "Missing Signed-off-by" (NO_AUTHOR_SIGN_OFF)
warning that checkpatch emits for regular committers who use
multiple email addresses.

The problem:
While running checkpatch on previous commits to the kernel,
there were multiple instances where the author had signed off
with an email address different from the one used to send the
patch.

From Lukas's data:

$ grep "NO_AUTHOR_SIGN_OFF" v5.4..v5.8.tsv | cut -f 7 | sort | uniq -c | sort -nr | head -n 8
    175 Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter at ffwll.ch>'
     68 Missing Signed-off-by: line by nominal patch author 'Trond Myklebust <trondmy at gmail.com>'
     43 Missing Signed-off-by: line by nominal patch author 'Thinh Nguyen <Thinh.Nguyen at synopsys.com>'
     40 Missing Signed-off-by: line by nominal patch author 'Pascal van Leeuwen <pascalvanl at gmail.com>'
     36 Missing Signed-off-by: line by nominal patch author 'Alex Maftei <amaftei at solarflare.com>'
     31 Missing Signed-off-by: line by nominal patch author 'Valdis Kletnieks <valdis.kletnieks at vt.edu>'
     24 Missing Signed-off-by: line by nominal patch author 'Luke Nelson <lukenels at cs.washington.edu>'

Most of these appear to be cases where the author signed off
using a different email address. I believe these can be handled
better.

Proposed Solution:
The .mailmap file contains mappings of the following types:
  name1 <mail1>
  <mail1> <mail2>
  name1 <mail1> <mail2>
  name1 <mail1> name2 <mail2>

Thus loading .mailmap data and matching email addresses for
the same author would resolve many of these warnings.
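
To illustrate the four entry types above, here is a rough sketch of how
one .mailmap line could be reduced to a (canonical email, alias email)
pair. It is written in Python purely for illustration (checkpatch itself
is Perl), the function name mailmap_pair is hypothetical, and it assumes
that the first address on a line is the canonical one and that "#" starts
a comment:

```python
import re

# Matches every address written as <addr> on a line.
EMAIL_RE = re.compile(r"<([^>]+)>")

def mailmap_pair(line):
    """Return (canonical_email, alias_email) for one .mailmap line.

    Hypothetical helper: assumes the first <...> address is the
    canonical one and the last is the address seen in commits.
    Returns None for blank lines and comment-only lines.
    """
    line = line.split("#", 1)[0]  # drop trailing or whole-line comments
    emails = [e.strip().lower() for e in EMAIL_RE.findall(line)]
    if not emails:
        return None
    # "name1 <mail1>" has a single address, which maps to itself.
    return emails[0], emails[-1]

if __name__ == "__main__":
    for entry in ("name1 <mail1@example.com>",
                  "<mail1@example.com> <mail2@example.com>",
                  "name1 <mail1@example.com> name2 <mail2@example.com>"):
        print(mailmap_pair(entry))
```

Names are ignored here on purpose, since only the addresses are needed
to decide whether two emails belong to the same author.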

Now the remaining problem at hand is to have a data structure
by which this query can be handled easily without much extra
overhead.

One possible solution: while parsing the author, also load .mailmap
and collect the email addresses associated with that author into a
hash. Then, when a Signed-off-by line is encountered and its email
is found in the hash (or some other data structure), the sign-off
is treated as matching the author.
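
As a rough sketch of the hash idea (again in Python only for
illustration, not the actual checkpatch code), every address in
.mailmap could be mapped to its canonical address, and the author and
sign-off addresses compared after that lookup. The names load_mailmap
and same_author are hypothetical, and the sketch assumes the first
address on each .mailmap line is the canonical one:

```python
import re

EMAIL_RE = re.compile(r"<([^>]+)>")

def load_mailmap(lines):
    """Build a dict mapping each known address to its canonical address."""
    canonical = {}
    for line in lines:
        line = line.split("#", 1)[0]  # ignore comments
        emails = [e.strip().lower() for e in EMAIL_RE.findall(line)]
        if not emails:
            continue
        proper = emails[0]  # assumption: first address is canonical
        for email in emails:
            canonical[email] = proper
    return canonical

def same_author(author_email, signoff_email, canonical):
    """True if both addresses resolve to the same canonical address."""
    a = author_email.strip().lower()
    s = signoff_email.strip().lower()
    # Addresses missing from .mailmap fall back to themselves.
    return canonical.get(a, a) == canonical.get(s, s)

if __name__ == "__main__":
    mailmap = [
        "# hypothetical entries",
        "A Developer <dev@example.org> <dev@example.com>",
        "A Developer <dev@example.org> <dev.old@example.net>",
    ]
    table = load_mailmap(mailmap)
    print(same_author("dev@example.com", "dev.old@example.net", table))
    print(same_author("dev@example.com", "other@example.com", table))
```

Each lookup is a constant-time hash access, so the extra cost is
essentially that of reading .mailmap once per checkpatch run.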

Is this feasible? I will be looking at other possibilities too, but it
would be great to have your view on it!

Thanks,
Dwaipayan.

