[Linux-kernel-mentees] [PATCH] checkpatch: Fix SPDX license check for scripts

Lukas Bulwahn lukas.bulwahn at gmail.com
Wed Jul 22 10:20:57 UTC 2020



On Tue, 21 Jul 2020, Mrinal Pandey wrote:

> In all the scripts, the SPDX license should be on the second line,
> the first line being the shebang, but checkpatch issues a warning
> "Misplaced SPDX-License-Identifier tag - use line 1 instead" for the
> scripts that have SPDX license in the second line.
>

That what you wrote is actually wrong. Checkpatch only wrongly warns if 
the first line is not included in the diff, right? Otherwise, it does not 
warn as intended.
 
> However, this warning is not issued when checkpatch is run on a file.
> The case for files has been handled gracefully by checking first line of
> the file to be a shebang and then setting `$checklicenseline` to `2`but
> this doesn't work when we don't have shebang in diff content of a patch
> and `$checklicenseline` continues to be `1` in such cases. Therefore,
> checkpatch expects the line `1` to contain the SPDX license when it
> should have been `2` instead.
>

It is described very complicated here. Maybe think about rephrasing it, so 
it becomes more clear.

> I noticed this false positive while running checkpatch on the set of
> commits from v5.7 to v5.8-rc1 of the kernel on the commits which modified
> a script file.
>

How about naming the commits and providing a statistics how often that 
happens?
 
> This false positive exists in checkpatch since commit a8da38a9cf0e
> ("checkpatch: add test for SPDX-License-Identifier on wrong line #")
> when the corresponding rule was first commited.
> 
> Fix this by setting `$checklicenseline` to `2` whenever the file or diff
> content we are checking comes from a script instead of checking first
> line to be a shebang, thus, informing checkpatch that the SPDX license
> should be expected on the second line.
>

Well, now the critical part:

It seems that this now makes the situation worse compared to before.

I quickly ran this evaluation:

$ cat first-line.sh 
#!/bin/bash

for file in $(git ls-files)
do
	firstline="$(head --silent -n 1 $file)"
	echo "$file: $firstline"
done

$ sh ./first-line.sh > first-lines

$ grep "#!" first-lines | wc -l
810

$ grep "#!" first-lines | sed 's/^\(.*[^\]*\):.*$/\1/' | grep -v "\."

$ grep "#!" first-lines | sed 's/^\(.*[^\]*\):.*$/\1/' | grep -v "\." | wc 
-l
120

$ grep "#!" first-lines | sed 's/^\(.*[^\]*\):.*$/\1/' | rev | cut -d"/" 
-f1 | rev | grep  "\." | cut -d"." -f2 | sort | uniq -c | sort -nr
    509 sh
     91 tc
     45 pl
     37 py
      6 awk
      1 config

So, in 120 cases you actually would now warn on line 1 where the should be
a shebang as intended.

Also, I cross-checked .yaml files:

$ grep "\.yaml:" first-lines | grep "SPDX-License-Identifier"  | wc -l
1035

$ grep "\.yaml:" first-lines | wc -l
1037


So, for yaml, SPDX is actually in line 1.


I guess you need to think about this a bit more.


> Signed-off-by: Mrinal Pandey <mrinalmni at gmail.com>
> ---
>  scripts/checkpatch.pl | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 4c820607540b..bdd2f9a80891 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -3166,10 +3166,11 @@ sub process {
>  		}
>  
>  # check for using SPDX license tag at beginning of files
> +		if ($realfile =~ /.*\.\(py\|sh\|pl\|awk\|tc\|yaml\)/) {
> +			$checklicenseline = 2;
> +		}
>  		if ($realline == $checklicenseline) {
> -			if ($rawline =~ /^[ \+]\s*\#\!\s*\//) {
> -				$checklicenseline = 2;
> -			} elsif ($rawline =~ /^\+/) {
> +			if ($rawline =~ /^\+/) {
>  				my $comment = "";
>  				if ($realfile =~ /\.(h|s|S)$/) {
>  					$comment = '/*';
> -- 
> 2.25.1
> 
> 


More information about the Linux-kernel-mentees mailing list