[Linux-kernel-mentees] Regarding "Linux Kernel: Evaluate and Improve checkpatch.pl"

Ayush ayush at disroot.org
Sun Sep 6 09:59:48 UTC 2020


August 31, 2020 10:44 AM, "Lukas Bulwahn" <lukas.bulwahn at gmail.com> wrote:

> On Sun, 30 Aug 2020, Ayush wrote:
> 
>> August 27, 2020 10:59 AM, "Lukas Bulwahn" <lukas.bulwahn at gmail.com> wrote:
>> 
>> On Mon, 24 Aug 2020, Ayush wrote:
>> 
>> August 22, 2020 1:36 PM, "Lukas Bulwahn" <lukas.bulwahn at gmail.com> wrote:
>> 
>> On Fri, 21 Aug 2020, Ayush wrote:
>> 
>> Hints to the first task:
>> 
>> Can you create a list of all non-merge commits that were added in the
>> version v5.8 of the kernel, i.e., all non-merge commits that are in v5.8
>> and not already in v5.7?
>> 
>> Can you share the script/command you executed and the resulting list on
>> github?
>> 
>> Can you run your script on all commits of this list above and record
>> all checkpatch.pl reports, and store them in your github repository?
>> 
>> Can you suggest ideas how to aggregate the findings and create a
>> statistics? For example: Which type of error is reported most?
>> Can you implement that idea?
>> 
>> I also suggest to have a look at
>> the options ./scripts/checkpatch.pl --list-types and
>> ./scripts/checkpatch.pl --show-types. The option --show-types changes
>> the output of checkpatch.pl to list type identifiers, so it is easier
>> to parse and aggregate the output.
>> 
>> Please also share the script you create for that purpose on your
>> github repository.
>> 
>> The second task is to pick one warning that appears often and improve
>> checkpatch.pl to handle that better and get it accepted by the kernel
>> community.
>> 
>> Hints to the second task follow when the first task is solved.
>> 
>> If you fail on any of those tasks, you are out of the selection process.
>> 
>> Lukas
>> 
>> Sir,
>> 
>> I have attempted the task 1 and pushed the same to GitHub.
>> 
>> Please have a look and suggest improvements.
>> 
>> https://github.com/eldraco19/evalute_improve_checkpatch_pl
>> 
>> Please let me know if there are any issues with this.
>> 
>> So far, so good.
>> 
>> Here are the questions we want to answer:
>> 
>> - So what are the 20 categories that occur most?
>> 
>> You are getting close to that answer, but you are not there yet.
>> 
>> Then look at the findings. For those 20 categories, are there specific
>> findings that are multiple times false positives?
>> 
>> So, the script complains about something, but it does not get that the
>> patch author wrote something completely unrelated to the error message.
>> 
>> Lukas
>> 
>> Sir,
>> 
>> I tried the given tasks and it can be found here,
>> 
>> https://github.com/eldraco19/evalute_improve_checkpatch_pl/blob/master/STATS.md
>> 
>> The solution is implemented a bit complicated, but well, at least, it
>> works if I believe your report. (I only read the code, but did not run
>> it.)
>> 
>> The goal now is to find a class of false positives and improve
>> checkpatch.pl accordingly.
>> 
>> I suggest that you look at the specific DIFF_IN_COMMIT_MSG reported
>> errors?
>> 
>> Provide a short assessment for each DIFF_IN_COMMIT_MSG error in the
>> 10 commits.
>> 
>> It should tell:
>> - what lines in the commit message did checkpatch.pl complain about?
>> - what is the pattern in the commit message?
>> - does patch(1) really stumble over that pattern?
>> - how would this pattern need to be provided to patch(1) so that it
>> would stumble over it?
>> - if no, why not?
>> - can we change checkpatch.pl to not raise an error for such a
>> situation? So, only raise an error when the pattern would really make
>> patch stumble on it?
>> 
>> Depending on the evaluation, we might continue to improve checkpatch.pl
>> for reporting this error type, or we decide to look at GIT_COMMIT_ID
>> errors, where I can quickly spot some false positives.
>> 
>> Best regards,
>> 
>> Lukas
>> 
>> Sir,
>> 
>> I analysed the given error type and my analysis can be found here:
>> 
>> https://github.com/eldraco19/evalute_improve_checkpatch_pl/blob/master/DIFF_IN_COMMIT_MSG.md
> 
> Evaluation looks sound. Although, I cannot really see the analysis of all
> 10 commits. You say the 10 commits fall into two classes, but how can
> anyone else judge this from your report?
> 
> I also do not fully understand your conclusion; to me, it seems to
> contradict itself. Fortunately, I think your analysis suggests that there
> is not a clear improvement to checkpatch.pl, as far as I see.
> 
> So, I do not think that this is a good starting point for a change of
> checkpatch.pl.
> 
> I suggest that you look at the error type GIT_COMMIT_ID. I have found some
> cases that seem to be suitable for improvement of the checkpatch.pl
> script.
> 
> Lukas

Sir,

I have been looking for further improvements to checkpatch.pl, especially for the GIT_COMMIT_ID error type.

I found that commits whose description references a revert commit get a GIT_COMMIT_ID
error even when the reference follows the proper syntax.

For example: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20200903&id=e8a170ff9a3576730e43c0dbdd27b7cd3dc56848

In this example, the commit description contains:

commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"")

which is correct as per the syntax, but checkpatch still gives an error.

So, I tried to fix this by:

---
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 149518d2a6a7..01df2b9b2236 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2874,6 +2874,9 @@ sub process {
                                $rawlines[$linenr] =~ /^\s*([^"]+)"\)/;
                                $orig_desc .= " " . $1;
                                $hasparens = 1;
+                       } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("(Revert "[^"]+[^"]")"\)/i) {
+                               $orig_desc = $1;
+                               $hasparens = 1;
                        }
 
                        ($id, $description) = git_commit_info($orig_commit,
---
(on linux next-20200903)
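
As a quick sanity check of the new pattern outside checkpatch.pl, it can be transcribed into a small Python sketch. This is only an illustration: Python's `re` syntax is assumed to behave like Perl's for this particular expression, and the function name `match_revert_reference` is hypothetical, not something that exists in checkpatch.pl.

```python
import re

# Python transcription of the pattern added in the patch above.
# re.IGNORECASE mirrors the /i modifier of the Perl regex.
REVERT_REF = re.compile(
    r'\bcommit\s+[0-9a-f]{5,}\s+\("(Revert "[^"]+[^"]")"\)',
    re.IGNORECASE,
)

def match_revert_reference(line):
    """Return the captured description if the line cites a revert commit, else None.

    Hypothetical helper, used only to exercise the pattern.
    """
    m = REVERT_REF.search(line)
    return m.group(1) if m else None

# The reference from the example commit, written on a single line.
example = ('commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent '
           'Clang from emitting libcalls to undefined SW FP routines"")')

print(match_revert_reference(example))
# A reference that wraps after "Revert" is not matched by this pattern.
print(match_revert_reference('commit 1234567890ab ("Revert'))
```

The first call returns the inner description including its nested quotes, which is what the patch assigns to $orig_desc; the second returns None because the pattern only sees one line.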

This patch fixes the error for commit references like the one in the example above, but some cases, such as

- commit 1234567890ab ("Revert
"foo bar"")

- commit 1234567890ab ("Revert "foo
bar"")

- commit 1234567890ab
("Revert "foo bar"")


that is, the cases where the reference wraps onto the next line, are not handled. There can be many different
patterns in which the line break occurs, so should we add a separate condition for each pattern, or should we
simply continue to report an error when the reference is wrapped?
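
One possible alternative to a separate condition per pattern is to collapse all whitespace, including line breaks, before matching, so that a single expression covers the three wrapped forms listed above. The Python sketch below illustrates the idea under a strong assumption: the whole commit message is available as one string, which is not how the line-by-line loop in checkpatch.pl works (the existing code peeks at $rawlines[$linenr] instead). The name `find_revert_reference` and the simplified pattern are hypothetical.

```python
import re

# Hypothetical variant of the pattern. Whitespace is normalized first, so the
# expression itself does not need to know where the line break was.
REVERT_REF = re.compile(
    r'\bcommit\s+([0-9a-f]{5,})\s+\("(Revert "[^"]+")"\)',
    re.IGNORECASE,
)

def find_revert_reference(message):
    """Return (commit id, description) for the first revert reference, or None."""
    # Newlines and runs of spaces or tabs all become a single space.
    flat = " ".join(message.split())
    m = REVERT_REF.search(flat)
    return (m.group(1), m.group(2)) if m else None

# The three wrapped forms from the list above.
wrapped = [
    'commit 1234567890ab ("Revert\n"foo bar"")',
    'commit 1234567890ab ("Revert "foo\nbar"")',
    'commit 1234567890ab\n("Revert "foo bar"")',
]
for msg in wrapped:
    print(find_revert_reference(msg))
```

All three inputs normalize to the same single-line reference, so each yields the same commit id and description. Whether this is preferable to reporting an error for wrapped references is a design question; the sketch only shows that one pattern can be enough once the line breaks are removed.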

Please look into this.


Thanks,
Ayush

