[Linux-kernel-mentees] [PATCH] checkpatch: fix false positive for REPEATED_WORD warning

Dwaipayan Ray dwaipayanray1 at gmail.com
Wed Oct 21 08:20:46 UTC 2020


Hey Aditya and Lukas,

> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > > index 9b9ffd876e8a..181c95691715 100755
> > > --- a/scripts/checkpatch.pl
> > > +++ b/scripts/checkpatch.pl
> > > @@ -3052,7 +3052,9 @@ sub process {
> > >
> > >  # check for repeated words separated by a single space
> > >             if ($rawline =~ /^\+/ || $in_commit_log) {
> > > -                   while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
> > > +                   # avoid repeating hex occurrences like 'ff ff fe 09 ...'
> > > +                   while ($rawline !~ /((\s)*[0-9a-z]{2}( )+){4,}/ &&

The pattern is probably wrong. It does not respect word boundaries, and it
does not accept tabs between words. An example of the first problem:

000 00 ff ff ...

The regex matches "00 00 ff ff" ignoring the first 0.
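
A quick way to reproduce both problems is to port the quoted pattern to
Python's re module. This is only a sketch: hex_run is a hypothetical helper
name, and it assumes the Python and Perl engines treat this particular
pattern the same way.

```python
import re

# Pattern copied from the quoted patch; Python port for experimentation only.
ORIGINAL = re.compile(r'((\s)*[0-9a-z]{2}( )+){4,}')

def hex_run(line):
    """Return the text the pattern treats as a hex run, or None (hypothetical helper)."""
    m = ORIGINAL.search(line)
    return m.group(0) if m else None

# No word boundary: the match starts inside "000", at its second character.
print(repr(hex_run("000 00 ff ff ...")))   # '00 00 ff ff '

# Tab-separated bytes are not recognised, because ( )+ needs a literal space.
print(repr(hex_run("ff\tff\tfe\t09\t")))   # None
```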

I think something like this would work better:

 # check for repeated words separated by a single space
-               if ($rawline =~ /^\+/ || $in_commit_log) {
+               if (($rawline =~ /^\+/ || $in_commit_log) &&
+                   $rawline !~ /(?:\b(?:[0-9a-f]{2}\s+){4,})/) {
                        pos($rawline) = 1 if (!$in_commit_log);
                        while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
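
For anyone who wants to try a few inputs against the suggested pattern, here
is a minimal Python sketch. The name looks_like_hex_dump is hypothetical, and
the port assumes \b and \s behave the same as in Perl for these ASCII inputs.

```python
import re

# Suggested pattern from this mail, ported to Python for testing.
SUGGESTED = re.compile(r'(?:\b(?:[0-9a-f]{2}\s+){4,})')

def looks_like_hex_dump(line):
    """True if the line contains four or more whitespace-separated hex bytes."""
    return SUGGESTED.search(line) is not None

print(looks_like_hex_dump("ff ff fe 09 aa"))     # True
print(looks_like_hex_dump("ff\tff\tfe\t09\t"))   # True: tabs are accepted
print(looks_like_hex_dump("000 00 ff ff ..."))   # False: only three whole bytes
print(looks_like_hex_dump("ff ff fe 09"))        # False: see note below
```

One thing this shows: every byte must be followed by whitespace, so a run of
exactly four bytes at the very end of a line is not counted. Whether that
matters depends on what the dump lines look like in practice.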

Please test it though. I only ran it on a few patterns.

Apart from that, this does fix the problem. But I am somewhat sceptical about
treating four or more two-character words in a row as a hex dump. There could
be counterexamples, but I guess they are very rare. It's not very general, but
for the moment it does the job.
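
To make the trade-off concrete, here is a rough Python sketch of the check
with and without the hex test. WORD is an assumed stand-in for $word_pattern,
repeated_words is a hypothetical helper, and the sample lines are made up.
Because the test guards the whole line, a genuine repeated word on a line
that also holds a hex run is no longer reported.

```python
import re

WORD = r'\b[A-Z]?[a-z]{2,}\b'   # assumption: stand-in for $word_pattern
PAIR = re.compile(r'\b(' + WORD + r') (?=(' + WORD + r'))')
HEX_RUN = re.compile(r'(?:\b(?:[0-9a-f]{2}\s+){4,})')

def repeated_words(line, skip_hex_lines=True):
    """List words immediately repeated after a single space (simplified)."""
    if skip_hex_lines and HEX_RUN.search(line):
        return []   # the whole line is skipped, not just the hex run
    return [m.group(1) for m in PAIR.finditer(line)
            if m.group(1).lower() == m.group(2).lower()]

# The false positive from the hex dump goes away ...
print(repeated_words("ff ff fe 09 aa", skip_hex_lines=False))   # ['ff']
print(repeated_words("ff ff fe 09 aa"))                         # []

# ... but so does a real repetition on the same line.
print(repeated_words("the the dump ff ff fe 09 aa"))            # []
```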

So I think it's probably good with some changes. Not sure what Joe
would have in mind though.

Lukas, I think with the changes in place, it is ready to go for discussion.

Thanks,
Dwaipayan.

