[Linux-kernel-mentees] [PATCH v2] checkpatch: improve email parsing

Lukas Bulwahn lukas.bulwahn at gmail.com
Wed Nov 4 08:05:29 UTC 2020

On Tue, 3 Nov 2020, Joe Perches wrote:

> On Tue, 2020-11-03 at 09:10 +0100, Lukas Bulwahn wrote:
> > Maybe you can coordinate among each other who would want to create
> > suitable fix rules here?
> Yes please.
> > Also, start with the class of the most frequent mistakes for
> > unexpected content after email addresses.
> > 
> > I imagine that a maintainer can simply run a tag sanitizing script
> > which just cleans up those stupid mistakes before creating their git
> > trees or sending git pulls to Linus.
> Does anyone really do that?
> It generally requires rebasing or post processing each commit after
> being committed before another commit occurs.

As far as I know from private converations, some maintainers do have 
early testing branches, rebase those, squash commits and hence, they 
might also sanitize the 'commit messages' if it works reliably; but I am 
not a maintainer. I guess we implement something useful and then ask some 
early-adopting maintainers to give it a spin and get some feedback.

> > Let us try to add these
> > sanitizing rules to checkpatch.pl with fix options for now; if that
> > sanitizing feature becomes a monster script of its own within
> > checkpatch.pl, we can refactor that into an independent script for
> > cleaning up.
> I rather doubt an independent script is going to be worthwhile
> as these rules shouldn't be all that complex.

Good to know. Okay, so let us add the rules and corresponding fix options 
to checkpatch.pl.

> The only prefixes acceptable for a stable address should be
> CC:|Cc:|cc:.  There are 2 uses in the last 100k commits for
> Signed-off-by: and Acked-by: with stable addresses, those should have a
> message/warning emitted in the future.
> The forms used with those cc: stable addresses:
> 2777	stable without comment
> 1381	stable # comment
> 74	stable [ comment ]
> So I suggest standardizing on no comment and # comment with any other
> style getting a warning.
> For non-stable <foo>-by: and cc: addresses and other signatures:
> Likely any content after a email address other than a parenthesized
> block should have some checkpatch message emitted.
> This should be OK:
> Signed-off-by: Full Name (comment) <address at domain.tld> (maintainer:...)
> But perhaps this should not be OK:
> Signed-off-by: Full Name (comment) <address at domain.tld> # comment
> There are 316 uses of this # comment style in the last 100k commits
> and 103 with (comment) after the address.
> Maybe the # use should be ok, maybe not.
> And anyone that uses a multiple comments in a name or a even
> a single comment in the email address should also get warned.
> The below should not be OK even if actually valid address forms:
> Signed-off-by: Full (comment1) Name (comment2) <address at domain.tld>
> Signed-off-by: Full Name <address@(comment)domain.tld>

Thanks for your evaluation and hints.
I agree with them as well. Let us try to establish one common way from 

Dwaipayan, if you code this into checkpatch.pl, maybe you can also add 
some hints on conventions for tags in the kernel (process) documentation 
to explain the rules and conventions we think make sense.


More information about the Linux-kernel-mentees mailing list