[Linux-kernel-mentees] [PATCH] checkpatch: Fix SPDX license check for scripts

Lukas Bulwahn lukas.bulwahn at gmail.com
Thu Jul 16 05:31:11 UTC 2020


On Thu, Jul 16, 2020 at 7:15 AM Mrinal Pandey <mrinalmni at gmail.com> wrote:

>
>
> On Tue, Jul 14, 2020 at 11:33 AM Lukas Bulwahn <lukas.bulwahn at gmail.com>
> wrote:
>
>>
>>
>> On Tue, 14 Jul 2020, Mrinal Pandey wrote:
>>
>> >
>> >
>> > On Tue, Jul 14, 2020 at 1:16 AM Lukas Bulwahn <lukas.bulwahn at gmail.com>
>> wrote:
>> >
>> >
>> >       On Mon, 13 Jul 2020, Mrinal Pandey wrote:
>> >
>> >       > In all the scripts, the SPDX license should be on the second
>> line,
>> >       > the first line being the "sh-bang", but checkpatch issues a
>> warning
>> >       > "Misplaced SPDX-License-Identifier tag - use line 1 instead"
>> for the
>> >       > scripts that have SPDX license in the second line.
>> >       >
>> >       > However, this warning is not issued when checkpatch is run on a
>> file using
>> >       > `-f` option. The case for files has been handled gracefully by
>> changing
>> >       > `$checklicenseline` to `2` but a corresponding check when
>> running checkpatch
>> >       > on a commit hash is missing.
>> >       >
>> >       > I noticed this false positive while running checkpatch on the
>> set of
>> >       > commits from v5.7 to v5.8-rc1 of the kernel on the commits
>> which modified
>> >       > a script file.
>> >       >
>> >       > This check is missing in checkpatch since commit a8da38a9cf0e
>> >       > ("checkpatch: add test for SPDX-License-Identifier on wrong
>> line #")
>> >       > when the corresponding rule was first commited.
>> >       >
>> >       > Fix this by setting `$checklicenseline` to `2` when the diff
>> content that
>> >       > is being checked originates from a script, thus, informing
>> checkpatch that
>> >       > the SPDX license should be on the second line.
>> >       >
>> >       > Signed-off-by: Mrinal Pandey <mrinalmni at gmail.com>
>> >       > ---
>> >       >  scripts/checkpatch.pl | 3 +++
>> >       >  1 file changed, 3 insertions(+)
>> >       >
>> >       > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
>> >       > index 4c820607540b..bbffd0c4449d 100755
>> >       > --- a/scripts/checkpatch.pl
>> >       > +++ b/scripts/checkpatch.pl
>> >       > @@ -3218,6 +3218,9 @@ sub process {
>> >       >               next if ($realfile !~ /\.(h|c|s|S|sh|dtsi|dts)$/);
>> >       >
>> >       >  # check for using SPDX-License-Identifier on the wrong line
>> number
>> >       > +             if ($realfile =~ /^scripts/) {
>> >       > +                    $checklicenseline = 2;
>> >       > +             }
>> >
>> >       I think this is somehow wrong here. The check for
>> checklicenseline = 2
>> >       looks very different above.
>> >
>> >       Why does -f work and using a patch file not work?
>> >
>> >
>> > Sir,
>> >
>> > I am going to explain my observation based on file
>> `scripts/atomic/gen-atomic-fallback.sh` and
>> > commit hash `37f8173dd849`.
>> >
>> > If we are checking against the file, `checklicenseline` is set to 1 and
>> when `realline` is 1 the above
>> > `if` block is triggered, then we check if this line is of the form
>> `#!/` using the regular expression
>> > `^[ \+]\s*\#\!\s*\/`. If this is the case we set `checklicenseline` to
>> `2` informing checkpatch that it should
>> > expect license on the second line and this works all fine for a file.
>> > The `if` block below my proposed changes evaluates to false in this
>> case and thus it emits no false warning.
>> >
>> > However, If we are checking a diff content, the above `if` block is not
>> triggered at all. This is
>> > because `realline` stores the actual line number of the line we are
>> checking currently out of diff content.
>> > This value is 2 because SPDX identifier is indeed at the second line in
>> the file but `checklicenseline` is still
>> > `1`.
>> > `realline` will never become equal to 1 again and thus the above `if`
>> condition will never be true in this case.
>> > Even if the above `if` block is triggered it would not update
>> `checklicenseline` to 2 as the regular expression
>> > is not satisfied since we don't have sh-bang in diff content and just
>> the SPDX tag.
>> > If we don't do this, the `if` block below evaluates to true when
>> `realline` is 2 and `checklicensline` is `1`
>> > leading
>> > to the emission of a false warning.
>> >
>>
>> So, maybe this whole logic needs to be reworked. If you do not know the
>> first line, you need to have a different criteria in the first place
>> to determine if you expect the license tag in the first or the second,
>> e.g., the file extension, and then checking line 1 for a shebang is just
>> sanity checking. If it is of a specific file extension, you know line 1
>> and it is not a shebang, that is probably worth noting as a different
>> recommendation in checkpatch.pl anyway.
>>
>
> Sir,
>
> When we know the first line, i.e. we are running checkpatch against a
> file, the existing logic
> works fine. We probably don't want to induce any changes there.
>
>
Why not? Do you think we would break things there? Then we should not touch
the code at all.
Do you think we cannot test it properly after the change? Then we should
think about how we make a proper regression test suite for that.

But when we don't know the first line, if am not wrong, it would go
> somewhat like:
> if (the file is a script) {
>     if (the first line is shebang) {
>         if (the second line is SPDX) {
>             All good
>         } else {
>             Issue a misplaced or missing SPDX tag warning
>         }
>     } else {
>             Issue a missing shebang warning
>     }
> } else {
>     if (the first line is SPDX) {
>         All good
>     } else {
>         Issue a misplaced or missing SPDX tag warning
>     }
> }
>
>
Basically agree, but that logic applies when you know the first line as
well (and only, right?). What if you do not know the first line, how would
you check "the first line is shebang" if you do not know the first line?


The missing shebang warning probably needs to go elsewhere in the whole
script.



> Lukas
>>
>>
>> > So, what I did was to check if the diff content we are checking
>> actually comes from a script, if yes, we can set
>> > `checklicenseline` to `2` to avoid this confusion.
>> >
>>
>> Why would you think that scripts are only in scripts?
>>
>> How about first listing all files where the SPDX tag is in line 2 in the
>> current repository, e.g., v5.8-rc5?
>>
>> Then, we look at that list and determine a suitable criteria for looking
>> in line 2 for the SPDX tag.
>>
>
> Yes, the scripts are not only in scripts. I have listed all the files
> where the SPDX tag should be
> on the second line. I've attached the list for reference. We should
> probably be checking the file
> extension to determine if the tag needs to be on the second line or not.
> The documentation says the SPDX tag should be present in all source files.
> Do these source files include
> Documentation files too?
>
>
How did you create that list?
Agree (if the way you created that list makes sense). File extension seems
to cover all cases, and checking for the directory 'scripts' does not.

We might also add a further sanity check in checkpatch.pl if someone adds
an executable file that is not with extension sh, pl, or py.

Lukas

>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linuxfoundation.org/pipermail/linux-kernel-mentees/attachments/20200716/ab67baac/attachment.html>


More information about the Linux-kernel-mentees mailing list