[Linux-kernel-mentees] Linux kernel checkpatch.pl mentorship

Lukas Bulwahn lukas.bulwahn at gmail.com
Sat Sep 12 11:03:00 UTC 2020



On Sat, 12 Sep 2020, Dwaipayan Ray wrote:

> 
> 
> On Fri, Sep 11, 2020 at 12:59 PM Lukas Bulwahn <lukas.bulwahn at gmail.com> wrote:
>       Dear Dwaipayan,
> 
> 
>       The zeroth task is to learn suitable netiquette for the communication with
>       the kernel community.
> 
>       First, please do not top-post.
> 
>           A: Because we read from top to bottom, left to right.
>           Q: Why should I start my reply below the quoted text?
> 
>           A: Because it messes up the order in which people normally read text.
>           Q: Why is top-posting such a bad thing?
> 
>           A: The lost context.
>           Q: What makes top-posted replies harder to read than bottom-posted?
> 
>           A: Yes.
>           Q: Should I trim down the quoted part of an email to which I'm
>           replying?
> 
>       Second, please always CC: linux-kernel-mentees at lists.linuxfoundation.org.
> 
>       Third, set up your email client according to the kernel community rules.
> 
> 
>       Then, the first task is to run checkpatch.pl on a few kernel patches and
>       collect the results. When you have that, please share your script with
>       me, e.g., in a github repository.
> 
> 
>       Hints to the first task:
> 
>       Can you create a list of all non-merge commits that were added in the
>       version v5.8 of the kernel, i.e., all non-merge commits that are in v5.8
>       and not already in v5.7?
> 
>       Can you share the script/command you executed and the resulting list on
>       github?
> 
>       Can you run your script on all commits of this list above and record
>       all checkpatch.pl reports, and store them in your github repository?
> 
>       Can you suggest ideas for how to aggregate the findings and create
>       statistics? For example: Which type of error is reported most often?
>       Can you implement that idea?
> 
> 
>       I also suggest having a look at the options ./scripts/checkpatch.pl
>       --list-types and ./scripts/checkpatch.pl --show-types. The option
>       --show-types changes the output of checkpatch.pl to list type identifiers,
>       so it is easier to parse and aggregate the output.
> 
>       Please also share the script you create for that purpose on your
>       github repository.
> 
> 
>       The second task is to pick one warning that appears often and improve
>       checkpatch.pl to handle that better and get it accepted by the kernel
>       community.
> 
>       Hints to the second task follow when the first task is solved.
> 
>       If you fail on any of those tasks, you are out of the selection process.
> 
>       I could implement that with just a few lines of code changes, but please
>       do not underestimate the learning curve here. I hope you are proficient in
>       Perl; that is required for this project.
> 
> 
>       Lukas
> 
> 
> 
> Hello Sir,
> I have gone through the zeroth task and I am aware of the mailing rules now. 
> Also I have implemented the first task and I would like you to review it.
>

Hmm, your email client still seems to be broken :( When you reply to my
email, the quoted text should be prefixed with ">", not indented with tabs.
Maybe you can fix that.

> The task was to run checkpatch.pl on some commits and aggregate the reports. 
> 
> The first subtask was to collect the non-merge commits between v5.7 and v5.8.
> I aggregated the commit hashes and author names into a single file using git's log
> and pretty format directives.
> 
> The next subtask was to run checkpatch.pl on all the given commits.
> I wrote a Perl script for this, which reads the commits stored in the
> file and runs checkpatch.pl on each commit hash. I also used the
> --show-types option at this stage, which made it easier to collect the
> warning and error identifiers.
> 
> The final subtask was to parse and aggregate the data. Looking at checkpatch's output
> format, I determined it was enough to parse only the first two tokens of each line.
> I matched three patterns: "WARNING:{warning_identifier}", "ERROR:{error_identifier}",
> and "Commit {commit_abbrev_hash}", and aggregated these values.
> 
> Finally I used them to find total commits read, total errors, total warnings,
> and the most frequent warnings and errors from checkpatch's output. 
> 
> I have uploaded the scripts and output files to https://github.com/raydwaipayan/lkm-task-1
>

I looked at your scripts, but I did not run them. They look as if they
do the job you describe. They are more complicated than needed, but
finding a simple solution was not part of the task. So, let us try them.
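
For readers following along, the collection steps described in the quoted
text can be sketched roughly as follows. This is a minimal Python sketch, not
the Perl scripts from the repository above. The helper names
(list_commits_cmd, checkpatch_cmd, collect_reports) are hypothetical, and the
sketch assumes a kernel tree that contains both the v5.7 and v5.8 tags. It
records only commit hashes, not author names.

```python
import subprocess
from typing import Dict, List


def list_commits_cmd(old: str = "v5.7", new: str = "v5.8") -> List[str]:
    """Command listing abbreviated hashes of non-merge commits in old..new."""
    return ["git", "log", "--no-merges", "--pretty=format:%h", f"{old}..{new}"]


def checkpatch_cmd(commit: str) -> List[str]:
    """Command running checkpatch.pl on one commit, with type identifiers shown."""
    return ["./scripts/checkpatch.pl", "--show-types", "--git", commit]


def run(cmd: List[str], cwd: str) -> str:
    # checkpatch.pl exits non-zero when it reports problems, so a non-zero
    # exit status is not treated as a failure here.
    result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True, check=False)
    return result.stdout


def collect_reports(kernel_tree: str, old: str = "v5.7", new: str = "v5.8") -> Dict[str, str]:
    """Map each non-merge commit in old..new to its raw checkpatch.pl report."""
    commits = run(list_commits_cmd(old, new), kernel_tree).split()
    return {commit: run(checkpatch_cmd(commit), kernel_tree) for commit in commits}
```

Calling collect_reports("/path/to/linux") (a placeholder path) would run
checkpatch.pl once per commit, which takes a while for a full release cycle.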

Please have a look at this patch:

https://lore.kernel.org/linux-kernel-mentees/20200912094826.150170-1-ayush@disroot.org/

The author states:

This issue was discovered through a thorough analysis of checkpatch.pl
errors and warnings of type GIT_COMMIT_ID on commits between v5.7 and 
v5.8.

Before applying this patch, checkpatch.pl reported 342 errors of type
GIT_COMMIT_ID. After applying patch, errors reduced to 284.


If your scripts work, you should be able to confirm the statement.
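
Confirming such a count comes down to tallying message types in the
--show-types output. A minimal Python sketch, assuming each reported message
starts with SEVERITY:TYPE: at the beginning of a line. The sample text is
illustrative, not real checkpatch.pl output, and count_types is a
hypothetical helper:

```python
import re
from collections import Counter

# Matches lines such as "ERROR:GIT_COMMIT_ID: ..." (assumed output shape).
TYPE_LINE = re.compile(r"^(ERROR|WARNING|CHECK):([A-Z0-9_]+):")


def count_types(report: str) -> Counter:
    """Count (severity, type) pairs in checkpatch.pl --show-types output."""
    counts: Counter = Counter()
    for line in report.splitlines():
        match = TYPE_LINE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Illustrative input only; the message texts are made up for this example.
SAMPLE = """\
WARNING:LONG_LINE: example warning text
ERROR:GIT_COMMIT_ID: example error text
ERROR:GIT_COMMIT_ID: example error text
some other line that is ignored
"""

for (severity, msg_type), n in count_types(SAMPLE).most_common():
    print(f"{n:6d} {severity}:{msg_type}")
```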

The tasks are:

1. Run your scripts and create full statistics of all error types with
their respective counts for v5.7..v5.8.

2. Apply the patch with git am.

3. Run your scripts again and create new statistics.

4. Compare the statistics before and after applying the patch.

5. Make all results available in your github repository.
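
Step 4 can be sketched as a per-type difference of the two statistics. A
minimal Python sketch, using the GIT_COMMIT_ID counts claimed in the patch
description as example input; compare is a hypothetical helper:

```python
from collections import Counter
from typing import Dict


def compare(before: Counter, after: Counter) -> Dict[str, int]:
    """Return after - before for every type whose count changed."""
    changed = {}
    for msg_type in sorted(set(before) | set(after)):
        delta = after[msg_type] - before[msg_type]  # missing types count as 0
        if delta != 0:
            changed[msg_type] = delta
    return changed


# Example input taken from the patch description quoted above.
before = Counter({"GIT_COMMIT_ID": 342})
after = Counter({"GIT_COMMIT_ID": 284})
print(compare(before, after))  # {'GIT_COMMIT_ID': -58}
```

Types whose counts did not change are omitted, so any unexpected side effect
of the patch on other types would show up directly in the result.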


Lukas

