[Linux-kernel-mentees] Regarding "Linux Kernel: Evaluate and Improve checkpatch.pl"
Ayush
ayush at disroot.org
Sun Sep 6 09:59:48 UTC 2020
August 31, 2020 10:44 AM, "Lukas Bulwahn" <lukas.bulwahn at gmail.com> wrote:
> On Sun, 30 Aug 2020, Ayush wrote:
>
>> August 27, 2020 10:59 AM, "Lukas Bulwahn" <lukas.bulwahn at gmail.com> wrote:
>>
>> On Mon, 24 Aug 2020, Ayush wrote:
>>
>> August 22, 2020 1:36 PM, "Lukas Bulwahn" <lukas.bulwahn at gmail.com> wrote:
>>
>> On Fri, 21 Aug 2020, Ayush wrote:
>>
>> Hints to the first task:
>>
>> Can you create a list of all non-merge commits that were added in the
>> version v5.8 of the kernel, i.e., all non-merge commits that are in v5.8
>> and not already in v5.7?
>>
>> Can you share the script/command you executed and the resulting list on
>> github?
>>
>> Can you run your script on all commits of this list above and record
>> all checkpatch.pl reports, and store them in your github repository?
>>
>> Can you suggest ideas how to aggregate the findings and create a
>> statistics? For example: Which type of error is reported most?
>> Can you implement that idea?
>>
>> I also suggest to have a look at
>> the options ./scripts/checkpatch.pl --list-types and
>> ./scripts/checkpatch.pl --show-types. The option --show-types changes
>> the output of checkpatch.pl to list type identifiers, so it is easier
>> to parse and aggregate the output.
>>
>> Please also share the script you create for that purpose on your
>> github repository.
>>
>> The second task is to pick one warning that appears often and improve
>> checkpatch.pl to handle that better and get it accepted by the kernel
>> community.
>>
>> Hints to the second task follow when the first task is solved.
>>
>> If you fail on any of those tasks, you are out of the selection process.
>>
>> Lukas
>>
>> Sir,
>>
>> I have attempted the task 1 and pushed the same to GitHub.
>>
>> Please have a look and suggest improvements.
>>
>> https://github.com/eldraco19/evalute_improve_checkpatch_pl
>>
>> Please let me know if there are any issues with this.
>>
>> So far, so good.
>>
>> Here are the questions we want to answer:
>>
>> - So what are the 20 categories that occur most?
>>
>> You are getting close to that answer, but you are not there yet.
>>
>> Then look at the findings. For those 20 categories, are there specific
>> findings that are multiple times false positives?
>>
>> So, the script complains about something, but it does not get that the
>> patch author wrote something completely unrelated to the error message.
>>
>> Lukas
>>
>> Sir,
>>
>> I tried the given tasks and it can be found here,
>>
>> https://github.com/eldraco19/evalute_improve_checkpatch_pl/blob/master/STATS.md
>>
>> The solution is implemented a bit complicated, but well, at least, it
>> works if I believe your report. (I only read the code, but did not run
>> it.)
>>
>> The goal now is to find a class of false positives and improve
>> checkpatch.pl accordingly.
>>
>> I suggest that you look at the specific DIFF_IN_COMMIT_MSG reported
>> errors?
>>
>> Provide a short assessment for each DIFF_IN_COMMIT_MSG error in the
>> 10 commits.
>>
>> It should tell:
>> - what lines in the commit message did checkpatch.pl complain about?
>> - what is the pattern in the commit message?
>> - does patch(1) really stumble over that pattern?
>> - how would this pattern need to be provided to patch(1) so that it
>> would stumble over it?
>> - if no, why not?
>> - can we change checkpatch.pl to not raise an error for such a
>> situation? So, only raise an error when the pattern would really make
>> patch stumble on it?
>>
>> Depending on the evaluation, we might continue to improve checkpatch.pl
>> for reporting this error type, or we decide to look at GIT_COMMIT_ID
>> errors, where I can quickly spot some false positives.
>>
>> Best regards,
>>
>> Lukas
>>
>> Sir,
>>
>> I analysed the given error type and my analysis can be found here:
>>
>> https://github.com/eldraco19/evalute_improve_checkpatch_pl/blob/master/DIFF_IN_COMMIT_MSG.md
>
> Evaluation looks sound. Although, I cannot really see the analysis of all
> 10 commits. You say the 10 commits fall into two classes, but how can
> anyone else judge this from your report?
>
> I also do not fully understand your conclusion; to me, it seems to
> contradict itself. Fortunately, I think your analysis suggests that there
> is not a clear improvement to checkpatch.pl, as far as I see.
>
> So, I do not think that this is a good starting point for a change of
> checkpatch.pl.
>
> I suggest that you look at the error type GIT_COMMIT_ID. I have found some
> cases that seem to be suitable for improvement of the checkpatch.pl
> script.
>
> Lukas
Sir,
I have been looking for more improvements in checkpatch.pl, especially with GIT_COMMIT_ID.
I found that commits which mentioned "revert commits" in their description will get error
even if they follow the proper syntax.
for example: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20200903&id=e8a170ff9a3576730e43c0dbdd27b7cd3dc56848
In this example,
commit description has,
commit 193392ed9f69 ("Revert "drm/amd/display: add -msse2 to prevent Clang from emitting libcalls to undefined SW FP routines"")
which is correct as per the syntax, but checkpatch still gives an error.
So, I tried to fix this by:
---
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 149518d2a6a7..01df2b9b2236 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2874,6 +2874,9 @@ sub process {
$rawlines[$linenr] =~ /^\s*([^"]+)"\)/;
$orig_desc .= " " . $1;
$hasparens = 1;
+ } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("(Revert "[^"]+[^"]")"\)/i) {
+ $orig_desc = $1;
+ $hasparens = 1;
}
($id, $description) = git_commit_info($orig_commit,
---
(on linux next-20200903)
This patch fixes the issues with commits of the similar type given in the above example but some cases like
- commit 1234567890ab ("Revert
"foo bar"")
- commit 1234567890ab ("Revert "foo
bar"")
- commit 1234567890ab
("Revert "foo bar"")
basically the cases where next-line comes are not handled. But there can a lot of different patterns where next-line is coming, so do we add a separate if condition for all the patterns? or we just continue giving
an error in case of next-line?
Please look into this.
Thanks,
Ayush
More information about the Linux-kernel-mentees
mailing list