[Linux-kernel-mentees] Regarding "Linux Kernel: Evaluate and Improve checkpatch.pl"

Lukas Bulwahn lukas.bulwahn at gmail.com
Mon Sep 7 07:38:59 UTC 2020


On Sun, Sep 6, 2020 at 11:59 AM Ayush <ayush at disroot.org> wrote:
>
> August 31, 2020 10:44 AM, "Lukas Bulwahn" <lukas.bulwahn at gmail.com> wrote:
>
> > On Sun, 30 Aug 2020, Ayush wrote:
> >
> >> August 27, 2020 10:59 AM, "Lukas Bulwahn" <lukas.bulwahn at gmail.com> wrote:
> >>
> >> On Mon, 24 Aug 2020, Ayush wrote:
> >>
> >> August 22, 2020 1:36 PM, "Lukas Bulwahn" <lukas.bulwahn at gmail.com> wrote:
> >>
> >> On Fri, 21 Aug 2020, Ayush wrote:
> >>
> >> Hints to the first task:
> >>
> >> Can you create a list of all non-merge commits that were added in the
> >> version v5.8 of the kernel, i.e., all non-merge commits that are in v5.8
> >> and not already in v5.7?
> >>
> >> Can you share the script/command you executed and the resulting list on
> >> github?
> >>
> >> Can you run your script on all commits of this list above and record
> >> all checkpatch.pl reports, and store them in your github repository?
> >>
> >> Can you suggest ideas how to aggregate the findings and create a
> >> statistics? For example: Which type of error is reported most?
> >> Can you implement that idea?
> >>
> >> I also suggest to have a look at
> >> the options ./scripts/checkpatch.pl --list-types and
> >> ./scripts/checkpatch.pl --show-types. The option --show-types changes
> >> the output of checkpatch.pl to list type identifiers, so it is easier
> >> to parse and aggregate the output.
> >>
> >> Please also share the script you create for that purpose on your
> >> github repository.
> >>
> >> The second task is to pick one warning that appears often and improve
> >> checkpatch.pl to handle that better and get it accepted by the kernel
> >> community.
> >>
> >> Hints to the second task follow when the first task is solved.
> >>
> >> If you fail on any of those tasks, you are out of the selection process.
> >>
> >> Lukas
> >>
> >> Sir,
> >>
> >> I have attempted the task 1 and pushed the same to GitHub.
> >>
> >> Please have a look and suggest improvements.
> >>
> >> https://github.com/eldraco19/evalute_improve_checkpatch_pl
> >>
> >> Please let me know if there are any issues with this.
> >>
> >> So far, so good.
> >>
> >> Here are the questions we want to answer:
> >>
> >> - So what are the 20 categories that occur most?
> >>
> >> You are getting close to that answer, but you are not there yet.
> >>
> >> Then look at the findings. For those 20 categories, are there specific
> >> findings that are multiple times false positives?
> >>
> >> So, the script complains about something, but it does not get that the
> >> patch author wrote something completely unrelated to the error message.
> >>
> >> Lukas
> >>
> >> Sir,
> >>
> >> I tried the given tasks and it can be found here,
> >>
> >> https://github.com/eldraco19/evalute_improve_checkpatch_pl/blob/master/STATS.md
> >>
> >> The solution is implemented a bit complicated, but well, at least, it
> >> works if I believe your report. (I only read the code, but did not run
> >> it.)
> >>
> >> The goal now is to find a class of false positives and improve
> >> checkpatch.pl accordingly.
> >>
> >> I suggest that you look at the specific DIFF_IN_COMMIT_MSG reported
> >> errors?
> >>
> >> Provide a short assessment for each DIFF_IN_COMMIT_MSG error in the
> >> 10 commits.
> >>
> >> It should tell:
> >> - what lines in the commit message did checkpatch.pl complain about?
> >> - what is the pattern in the commit message?
> >> - does patch(1) really stumble over that pattern?
> >> - how would this pattern need to be provided to patch(1) so that it
> >> would stumble over it?
> >> - if no, why not?
> >> - can we change checkpatch.pl to not raise an error for such a
> >> situation? So, only raise an error when the pattern would really make
> >> patch stumble on it?
> >>
> >> Depending on the evaluation, we might continue to improve checkpatch.pl
> >> for reporting this error type, or we decide to look at GIT_COMMIT_ID
> >> errors, where I can quickly spot some false positives.
> >>
> >> Best regards,
> >>
> >> Lukas
> >>
> >> Sir,
> >>
> >> I analysed the given error type and my analysis can be found here:
> >>
> >> https://github.com/eldraco19/evalute_improve_checkpatch_pl/blob/master/DIFF_IN_COMMIT_MSG.md
> >
> > Evaluation looks sound. Although, I cannot really see the analysis of all
> > 10 commits. You say the 10 commits fall into two classes, but how can
> > anyone else judge this from your report?
> >
> > I also do not fully understand your conclusion; to me, it seems to
> > contradict itself. Fortunately, I think your analysis suggests that there
> > is not a clear improvement to checkpatch.pl, as far as I see.
> >
> > So, I do not think that this is a good starting point for a change of
> > checkpatch.pl.
> >
> > I suggest that you look at the error type GIT_COMMIT_ID. I have found some
> > cases that seem to be suitable for improvement of the checkpatch.pl
> > script.
> >
> > Lukas
>
> Sir,
>
> I have been looking for more improvements in checkpatch.pl, especially with GIT_COMMIT_ID.
>
> I found that commits which mentioned "revert commits" in their description will get error
> even if they follow the proper syntax.
>
> for example: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20200903&id=e8a170ff9a3576730e43c0dbdd27b7cd3dc56848
>
> In this example,
> commit description has,
>
> commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"")
>
> which is correct as per the syntax, but checkpatch still gives an error.
>
> So, I tried to fix this by:
>
> ---
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 149518d2a6a7..01df2b9b2236 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -2874,6 +2874,9 @@ sub process {
>                                 $rawlines[$linenr] =~ /^\s*([^"]+)"\)/;
>                                 $orig_desc .= " " . $1;
>                                 $hasparens = 1;
> +                       } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("(Revert "[^"]+[^"]")"\)/i) {
> +                               $orig_desc = $1;
> +                               $hasparens = 1;
>                         }
>
>                         ($id, $description) = git_commit_info($orig_commit,
> ---
> (on linux next-20200903)
>

First, your email client is broken. Just to let you know. My client is
broken, too :) I did not bother now to pull out my working email
client...

Generally, you identified the right place to add this extended check
and the check makes sense.

> This patch fixes the issues with commits of the similar type given in the above example but some cases like
>
> - commit 1234567890ab ("Revert
> "foo bar"")
>
> - commit 1234567890ab ("Revert "foo
> bar"")
>
> - commit 1234567890ab
> ("Revert "foo bar"")
>
>
> basically the cases where next-line comes are not handled. But there can a lot of different patterns where next-line is coming, so do we add a separate if condition for all the patterns? or we just continue giving
> an error in case of next-line?
>

I think you should extend this whole check to work properly with line
breaks. You can see how this is implemented currently in checkpatch.pl
just a few lines above, right?

Can you provide a full list of checkpatch.pl findings in v5.4..v5.8
where all the different commit 1234567890ab ("Revert "..."") variants
in the commit message appear?

Then, we can use those commits directly as test cases for your extension?

Best regards,

Lukas


More information about the Linux-kernel-mentees mailing list