[Linux-kernel-mentees] [PATCH] checkpatch: Fix SPDX license check for scripts

Lukas Bulwahn lukas.bulwahn at gmail.com
Fri Jul 24 08:09:25 UTC 2020



On Fri, 24 Jul 2020, Mrinal Pandey wrote:

> 
> 
> On Wed, Jul 22, 2020 at 3:51 PM Lukas Bulwahn <lukas.bulwahn at gmail.com> wrote:
> 
> 
>       On Tue, 21 Jul 2020, Mrinal Pandey wrote:
> 
>       > In all the scripts, the SPDX license should be on the second line,
>       > the first line being the shebang, but checkpatch issues a warning
>       > "Misplaced SPDX-License-Identifier tag - use line 1 instead" for the
>       > scripts that have SPDX license in the second line.
>       >
> 
>       That what you wrote is actually wrong. Checkpatch only wrongly warns if
>       the first line is not included in the diff, right? Otherwise, it does not
>       warn as intended.
> 
> 
> Sir,
> 
> Yes. I would modify the description accordingly.
> 
>       > However, this warning is not issued when checkpatch is run on a file.
>       > The case for files has been handled gracefully by checking first line of
>       > the file to be a shebang and then setting `$checklicenseline` to `2`but
>       > this doesn't work when we don't have shebang in diff content of a patch
>       > and `$checklicenseline` continues to be `1` in such cases. Therefore,
>       > checkpatch expects the line `1` to contain the SPDX license when it
>       > should have been `2` instead.
>       >
> 
>       It is described very complicated here. Maybe think about rephrasing it, so
>       it becomes more clear.
> 
> 
>       > I noticed this false positive while running checkpatch on the set of
>       > commits from v5.7 to v5.8-rc1 of the kernel on the commits which modified
>       > a script file.
>       >
> 
>       How about naming the commits and providing a statistics how often that
>       happens?
> 
>       > This false positive exists in checkpatch since commit a8da38a9cf0e
>       > ("checkpatch: add test for SPDX-License-Identifier on wrong line #")
>       > when the corresponding rule was first commited.
>       >
>       > Fix this by setting `$checklicenseline` to `2` whenever the file or diff
>       > content we are checking comes from a script instead of checking first
>       > line to be a shebang, thus, informing checkpatch that the SPDX license
>       > should be expected on the second line.
>       >
> 
>       Well, now the critical part:
> 
>       It seems that this now makes the situation worse compared to before.
> 
>       I quickly ran this evaluation:
> 
>       $ cat first-line.sh
>       #!/bin/bash
> 
>       for file in $(git ls-files)
>       do
>               firstline="$(head --silent -n 1 $file)"
>               echo "$file: $firstline"
>       done
> 
>       $ sh ./first-line.sh > first-lines
> 
>       $ grep "#!" first-lines | wc -l
>       810
> 
>       $ grep "#!" first-lines | sed 's/^\(.*[^\]*\):.*$/\1/' | grep -v "\."
> 
>       $ grep "#!" first-lines | sed 's/^\(.*[^\]*\):.*$/\1/' | grep -v "\." | wc
>       -l
>       120
> 
>       $ grep "#!" first-lines | sed 's/^\(.*[^\]*\):.*$/\1/' | rev | cut -d"/"
>       -f1 | rev | grep  "\." | cut -d"." -f2 | sort | uniq -c | sort -nr
>           509 sh
>            91 tc
>            45 pl
>            37 py
>             6 awk
>             1 config
> 
>       So, in 120 cases you actually would now warn on line 1 where the should be
>       a shebang as intended.
> 
>       Also, I cross-checked .yaml files:
> 
>       $ grep "\.yaml:" first-lines | grep "SPDX-License-Identifier"  | wc -l
>       1035
> 
>       $ grep "\.yaml:" first-lines | wc -l
>       1037
> 
> 
>       So, for yaml, SPDX is actually in line 1.
> 
> 
> Yes, I should remove the `yaml` files from this check.
> 
> 
>       I guess you need to think about this a bit more.
> 
> 
> If I am correct now we have an issue with all those files which are scripts without an extension.

Yes, correct.

> In this case, the earlier logic which I removed was working fine(now I get an idea of why it was like
> that in the first place). I think the best way is to keep both my suggested change and the logic that
> already exists so that we can at least avoid those false positives when we know the extension without
> essentially duplicating the logic.

You need to combine the previous heuristic with your further extension, 
making it overall better and try to keep the logic checking as simple as 
possible.

If the first line is there and a shebang, use it to make a decision. If 
the first line is not available, use another heuristic.

> Checking for files with no extension, no shebang and only SPDX in diff is tricky since we can't actually open the file.

You cannot open the file, but if a first line is available, you need to 
use that information.

I suggest that you evaluate your patch on a set of use cases with a clear 
report what was reported before and after applying your patch.

Lukas


> Please let me know your views.
>  
> Thank you.
> 
> 
>       > Signed-off-by: Mrinal Pandey <mrinalmni at gmail.com>
>       > ---
>       >  scripts/checkpatch.pl | 7 ++++---
>       >  1 file changed, 4 insertions(+), 3 deletions(-)
>       >
>       > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
>       > index 4c820607540b..bdd2f9a80891 100755
>       > --- a/scripts/checkpatch.pl
>       > +++ b/scripts/checkpatch.pl
>       > @@ -3166,10 +3166,11 @@ sub process {
>       >               }
>>       >  # check for using SPDX license tag at beginning of files
>       > +             if ($realfile =~ /.*\.\(py\|sh\|pl\|awk\|tc\|yaml\)/) {
>       > +                     $checklicenseline = 2;
>       > +             }
>       >               if ($realline == $checklicenseline) {
>       > -                     if ($rawline =~ /^[ \+]\s*\#\!\s*\//) {
>       > -                             $checklicenseline = 2;
>       > -                     } elsif ($rawline =~ /^\+/) {
>       > +                     if ($rawline =~ /^\+/) {
>       >                               my $comment = "";
>       >                               if ($realfile =~ /\.(h|s|S)$/) {
>       >                                       $comment = '/*';
>       > --
>       > 2.25.1
>       >
>       >
> 
> 
> 


More information about the Linux-kernel-mentees mailing list