[Linux-kernel-mentees] checkpatch.pl improvement: NO_AUTHOR_SIGN_OFF warnings for users with multiple emails

Lukas Bulwahn lukas.bulwahn at gmail.com
Mon Sep 21 13:23:14 UTC 2020



On Mon, 21 Sep 2020, Dwaipayan Ray wrote:

> Hi Joe and Lukas and others,
> I would like to elaborate on the issue, and on the solution I have
> in mind, for the missing author Signed-off-by warning that is
> raised for regular committers who use multiple email addresses.
> 
> The problem:
> While running checkpatch on previous commits to the kernel, I
> found multiple instances where the author had signed off with an
> email address different from the one used to mail the patch.
> 
> From Lukas's data:
> 
> $ grep "NO_AUTHOR_SIGN_OFF" v5.4..v5.8.tsv | cut -f 7 | sort | uniq -c | sort -nr | head -n 8
>     175 Missing Signed-off-by: line by nominal patch author 'Daniel Vetter <daniel.vetter at ffwll.ch>'
>      68 Missing Signed-off-by: line by nominal patch author 'Trond Myklebust <trondmy at gmail.com>'
>      43 Missing Signed-off-by: line by nominal patch author 'Thinh Nguyen <Thinh.Nguyen at synopsys.com>'
>      40 Missing Signed-off-by: line by nominal patch author 'Pascal van Leeuwen <pascalvanl at gmail.com>'
>      36 Missing Signed-off-by: line by nominal patch author 'Alex Maftei <amaftei at solarflare.com>'
>      31 Missing Signed-off-by: line by nominal patch author 'Valdis Kletnieks <valdis.kletnieks at vt.edu>'
>      24 Missing Signed-off-by: line by nominal patch author 'Luke Nelson <lukenels at cs.washington.edu>'
> 
> So most of them belong to the case where they have signed off
> using a different mail address. I believe these can be handled
> better.
> 
> Proposed Solution:
> The .mailmap file contains mappings of the following types:
>   name1 <mail1>
>   <mail1> <mail2>
>   name1 <mail1> <mail2>
>   name1 <mail1> name2 <mail2>
> 
> Thus loading .mailmap data and matching email addresses for
> the same author would resolve many of these warnings.
> 
> Now the remaining problem at hand is to have a data structure
> by which this query can be handled easily without much extra
> overhead.
> 
> One possible solution is, while parsing the author, to also load
> .mailmap and store the email addresses associated with the author
> in a hash. Then, when a Signed-off-by line is encountered and its
> email is found in our hash (or maybe some other data structure),
> the Signed-off-by match should be positive.
> 
> Is this feasible? I would be looking at other possibilities too. But it
> would be great to have your view on it!
>

This sounds like a plan.
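
To make the proposed lookup concrete, here is a minimal sketch of the
idea. It is written in Python for illustration only (checkpatch.pl
itself is Perl), and it is not a proposed implementation. Assumptions:
the function names parse_mailmap and same_author are hypothetical, the
sample addresses are made up, a two-address .mailmap line is read as
"canonical address first, alternate address second", and addresses are
compared case-insensitively.

```python
import re

# Matches every <address> on a .mailmap line.
EMAIL_RE = re.compile(r"<([^>]+)>")

# Made-up sample covering the four .mailmap line shapes listed above.
SAMPLE_MAILMAP = """\
# comment lines are skipped
Jane Doe <jane@example.org>
<jane@example.org> <jane@old.example.org>
John Roe <john@example.org> <jroe@corp.example.com>
John Roe <john@example.org> Johnny <johnny@home.example.net>
"""


def parse_mailmap(text):
    """Return a dict mapping each alternate address to its canonical one.

    Lines with a single address only fix the name, so they add no alias.
    """
    canonical = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        emails = [e.lower() for e in EMAIL_RE.findall(line)]
        if len(emails) == 2:
            # Assumed order: canonical address first, alternate second.
            canonical[emails[1]] = emails[0]
    return canonical


def same_author(author_email, signoff_email, canonical):
    """True if both addresses resolve to the same canonical address."""
    a = author_email.lower()
    s = signoff_email.lower()
    return canonical.get(a, a) == canonical.get(s, s)


aliases = parse_mailmap(SAMPLE_MAILMAP)
print(same_author("jane@old.example.org", "jane@example.org", aliases))
print(same_author("jroe@corp.example.com", "johnny@home.example.net", aliases))
print(same_author("jane@example.org", "john@example.org", aliases))
```

Resolving both sides to a canonical address keeps each check to two
hash lookups, so the extra cost per Signed-off-by line stays small once
.mailmap has been loaded.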

I expect that this task would take you roughly 40 hours of work to get
a first patch into a state ready for review on lkml.

Please use the linux-kernel-mentees mailing list for early versions of 
your work before that.

Then, I expect another 40 hours of work to get all reviewers/maintainers 
happy and get it towards final acceptance.

I suggest that for the first mentorship milestone, we also include
some basic documentation and some patches directed to the developers
we can already see with special setups, e.g., Daniel Vetter, adding
suitable entries in .mailmap for them. They can then ack those .mailmap
patches and integrate them.

Let us aim to have 10 such patches sent out for the most regular
developers, with at least one of them accepted.

Would you agree to that for the first milestone?

If so, please say so here and provide all needed information in the
community bridge system. Then, we will proceed in the system, such that
you get an official go-ahead and can start fleshing out these patches.


Lukas

