[Linux-kernel-mentees] Linux kernel checkpatch.pl mentorship

Lukas Bulwahn lukas.bulwahn at gmail.com
Wed Sep 16 07:01:41 UTC 2020



On Tue, 15 Sep 2020, Dwaipayan Ray wrote:

> Hi,
> Sorry for the late reply.
> 
> > First explain:
> >> - which situations does checkpatch.pl currently complain about?
> >
> Currently, checkpatch complains whenever the author mail is not
> found in any signed-off-by block in the patch.
> 

Generally, this makes sense, right?

The author shall also sign-off the patch.

Now, the question is how to determine identities?

We have two bits of information, the name and the email.

So, there are three options to define 'identity':

A. the name and the email needs to match.
B. at least, the name needs to match
C. at least, the email needs to match

I believe A. is perfect; B and C deserve at least a note from 
checkpatch.pl.

> 
> > - for which situation do you want to have more refined checks?
> >
> The situations where author might have signed off using a different
> email. I believe multiple mail addresses isn't uncommon.
>

Yes, but who is doing it on purpose, and who is doing it by mistake.

We need numbers. How often is A met? How often is B met? How often is C 
met?

We can improve the identity check, but it probably needs more thought than 
the email does not need to match, but the name does. That will probably 
just show more cases where the name does not match, but the email does.

> 
> > - why does that actually improve checkpatch.pl?
> 
> It shall significantly reduce the number of author_sign_off warnings. I have
> not yet created a statistical count, but looking at the data I found several
> such instances. This is certainly a false positive due to a condition which
> checkpatch was not programmed to handle.
> The avoidance of warnings on such known cases might also save the 
> committer and the maintainers some time.
>

Hmm, not really a good rationale yet. How about deleting checkpatch.pl 
completely? Then it cannot complain either. No false positives, no problem 
:)

Maybe it was not programmed to handle that on purpose, e.g., it was a 
clear design decision to check if at least the email is met.

Check the git history of checkpatch.pl with git blame and you will know 
more. You need to understand the history of this check and warning.

> 
> > Checkpatch.pl should complain when developers do something wrong.
> >
> > To really understand what is wrong behavior and what is not, you probably
> > need to create some statistics on who authors and signs off with which
> > names and email addresses.
> >
> > Maybe we can collect all the previous instances where we know that
> > frequent developers, e.g., with more >100 commits, use multiple email
> > addresses interchangeably. If we add that list to the repository and
> > let others know how to maintain it, checkpatch.pl can make use of that.
> >
> > With that extended check, we can warn newbies that just have a broken git
> > and sign-off setup and still reduce the false positives for the
> > experienced developers that might just have good reasons to have the
> > setup they have, i.e., they have this setup for many years and want to
> > keep it that way.
> 
> This seems like a great idea. I can load the mailmap data into checkpatch
> and form some kind of map between names and mail addresses. 
> If two mail addresses belong to same user then the warning can be ignored
> totally.
>

Please check with an evaluation if that makes a big difference, though.
We have other sources to determine identities as well.
 
> I know that the kernel community is strict about such changes. So will this be
> acceptible? I can generate a proof of concept patch with the data at hand
> if it seems like a good thing to work on.
>

If you do your homework, proper research what was decided in the past, 
proper evaluations what the difference of your change is, proper 
implementation, proper arguments for your change, it has high chances of 
being accepted. Many agree that checkpatch.pl can be useful, but many 
agree that it needs some improvements.

It is certainly not a quick improvement, and needs some thought to make it 
really better.

> 
> > You can try to work that through or look for another case of potential
> > checkpatch.pl improvement in your evaluation data.
> 
> I haven't found anything substantial yet. I will continue looking. 
> Earlier, you had told if I would like to take the task from Ayush to
> fix checkpatch with git ranges. I would like to know about the task
> and take it up if possible.
>

Please reach out to Ayush to understand the encountered issue and CC: this 
mailing list.

I know there are more issues that checkpatch.pl can be improved with, keep 
looking :)


Lukas


More information about the Linux-kernel-mentees mailing list