[Linux-kernel-mentees] Investigating ./scripts/kernel-doc for third phase of mentorship

Lukas Bulwahn lukas.bulwahn at gmail.com
Sun Feb 14 09:20:21 UTC 2021


On Sat, Feb 13, 2021 at 10:46 AM Aditya <yashsri421 at gmail.com> wrote:
>
> On 11/2/21 4:46 pm, Aditya wrote:
> > On 11/2/21 4:13 pm, Lukas Bulwahn wrote:
> >> On Thu, Feb 11, 2021 at 9:02 AM Aditya <yashsri421 at gmail.com> wrote:
> >>>
> >>> On 9/2/21 10:31 pm, Aditya wrote:
> >>>> On 9/2/21 9:43 am, Lukas Bulwahn wrote:
> >>>>> Hi Aditya,
> >>>>>
> >>>>>
> >>>>> I have been using ./scripts/kernel-doc lately and noticed a few issues
> >>>>> with this script.
> >>>>> For the third phase, I expect you to investigate the issues on
> >>>>> ./scripts/kernel-doc below, describe the issue to the kernel-doc
> >>>>> maintainer, and work on proper support for this parser in kernel-doc.
> >>>>>
> >>>>> When running kernel-doc on these files below, you should observe these errors:
> >>>>>
> >>>>> ./scripts/kernel-doc -none kernel/gcov/gcc_4_7.c
> >>>>>
> >>>>> Use of uninitialized value $param in regexp compilation at
> >>>>> ./scripts/kernel-doc line 1559, <IN_FILE> line 95.
> >>>>> Use of uninitialized value $actual in substitution (s///) at
> >>>>> ./scripts/kernel-doc line 1523, <IN_FILE> line 95.
> >>>>> Use of uninitialized value $actual in substitution (s///) at
> >>>>> ./scripts/kernel-doc line 1523, <IN_FILE> line 95.
> >>>>> Use of uninitialized value $param in substitution (s///) at
> >>>>> ./scripts/kernel-doc line 1617, <IN_FILE> line 95.
> >>>>> Use of uninitialized value $param in hash element at
> >>>>> ./scripts/kernel-doc line 1651, <IN_FILE> line 95.
> >>>>> Use of uninitialized value $param in pattern match (m//) at
> >>>>> ./scripts/kernel-doc line 1651, <IN_FILE> line 95.
> >>>>> Use of uninitialized value $param in hash element at
> >>>>> ./scripts/kernel-doc line 1652, <IN_FILE> line 95.
> >>>>> Use of uninitialized value $param in pattern match (m//) at
> >>>>> ./scripts/kernel-doc line 1654, <IN_FILE> line 95.
> >>>>> Use of uninitialized value $param in concatenation (.) or string at
> >>>>> ./scripts/kernel-doc line 1655, <IN_FILE> line 95.
> >>>>> Use of uninitialized value $param in hash element at
> >>>>> ./scripts/kernel-doc line 1672, <IN_FILE> line 95.
> >>>>>
> >>>>>
> >>>>> ./scripts/kernel-doc -none include/linux/zstd.h
> >>>>>
> >>>>> include/linux/zstd.h:153: error: Cannot parse struct or union!
> >>>>> include/linux/zstd.h:170: error: Cannot parse struct or union!
> >>>>> include/linux/zstd.h:180: error: Cannot parse struct or union!
> >>>>> include/linux/zstd.h:230: error: Cannot parse struct or union!
> >>>>> include/linux/zstd.h:273: error: Cannot parse struct or union!
> >>>>> include/linux/zstd.h:365: error: Cannot parse struct or union!
> >>>>>
> >>>>> include/linux/zstd.h:421: error: Cannot parse struct or union!
> >>>>>
> >>>>> include/linux/zstd.h:537: error: Cannot parse struct or union!
> >>>>> include/linux/zstd.h:682: error: Cannot parse struct or union!
> >>>>>
> >>>>> include/linux/zstd.h:935: error: Cannot parse struct or union!
> >>>>>
> >>>>> Can you reproduce these issues?
> >>>>>>>
> >>>>> Further, I would see that you can create two new features for kernel-doc:
> >>>>>
> >>>>> - Support for warning types and configuration of build warning flags,
> >>>>> i.e., with -W command-line arguments and output as known for compilers
> >>>>> - Support of default arguments
> >>>>>
> >>>>> Let us start with investigation of those issues in the files, though.
> >>>>> Then, we see how complicated it is to resolve those and discuss that
> >>>>> with the maintainer.
> >>>>>
> >>>>>
> >>>>
> >>>> Alright Lukas. Sounds interesting. I'll get back to you with my findings.
> >>>>
> >>>
> >>> Hi Lukas
> >>>
> >>> I was able to reproduce the issues. In my investigation I found the
> >>> following to be the cause of this:
> >>>
> >>> - For scripts/kernel-doc -none kernel/gcov/gcc_4_7.c :-
> >>> Cause:
> >>> struct gcov_info is a structure in this file, which has a parameter:
> >>> void (*merge[GCOV_COUNTERS])(gcov_type *, unsigned int)
> >>>
> >>> This argument goes to the elsif condition for argument parsing "elsif
> >>> ($arg =~ m/\(.+\)\s*\(/)"
> >>> But the regex match at Line 1556 ($arg =~
> >>> m/[^\(]+\(\*?\s*([\w\.]*)\s*\)/;) is not able to capture the $param.
> >>>
> >>> In my investigation, I found that it was expected to capture
> >>> "*merge[GCOV_COUNTERS]" as $param. But the regex does not take into
> >>> account the presence of square brackets("["."]"). As a result, the
> >>> $param capture does not happen as expected.
> >>> As $param remains undefined in this case, it causes the above error to
> >>> occur.
> >>>
> >>> - For scripts/kernel-doc -none include/linux/zstd.h :-
> >>> Cause: Certain struct definitions present in the file do not follow
> >>> the expected format for parsing.
> >>> For eg. for "struct ZSTD_compressionParameters", it is defined with
> >>> the help of typedef syntax, ie
> >>> typedef struct {
> >>>         unsigned int windowLog;
> >>>         unsigned int chainLog;
> >>>         unsigned int hashLog;
> >>>         unsigned int searchLog;
> >>>         unsigned int searchLength;
> >>>         unsigned int targetLength;
> >>>         ZSTD_strategy strategy;
> >>> } ZSTD_compressionParameters;
> >>>
> >>> Here, the definition might be compiler error free, but for kernel-doc
> >>> parsing, the definition was probably supposed to go like:
> >>>
> >>> typedef struct ZSTD_compressionParameters {
> >>>         unsigned int windowLog;
> >>>         unsigned int chainLog;
> >>>         unsigned int hashLog;
> >>>         unsigned int searchLog;
> >>>         unsigned int searchLength;
> >>>         unsigned int targetLength;
> >>>         ZSTD_strategy strategy;
> >>> } ZSTD_compressionParameters;
> >>>
> >>> OR
> >>>
> >>> struct ZSTD_compressionParameters {
> >>>         unsigned int windowLog;
> >>>         unsigned int chainLog;
> >>>         unsigned int hashLog;
> >>>         unsigned int searchLog;
> >>>         unsigned int searchLength;
> >>>         unsigned int targetLength;
> >>>         ZSTD_strategy strategy;
> >>> };
> >>>
> >>> The parser probably expects struct struct_name to occur together,
> >>> causing the above error.
> >>>
> >>
> >> Good initial analysis. Do you want to extend those patterns, such that
> >> these definitions are properly parsed as well?
> >>
> >> Then, we should check if we observe any other difference (caused any
> >> regression) running kernel-doc on the complete tree.
> >>
>
> Hi Lukas
>
> I generated the report by running kernel-doc on complete tree.
> The report in itself is quite large (~20k lines). So, I shortened it
> in no-warning(error_report.txt) and less-common
> warnings(reduced_report.txt) versions.
>
> No-warning version(error_report.txt):
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/kernel-doc/error_report.txt
>
> Less common warning version:
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/kernel-doc/reduced_report.txt
>
> Complete report:
> https://github.com/AdityaSrivast/kernel-tasks/blob/master/random/kernel-doc/report.txt
>
> For the Uninitialized $param error, apart from kernel/gcov/gcc_4_7.c
> file, drivers/net/wireless/broadcom/brcm80211/brcmfmac/fweh.h also
> emits the same error with the script. The reason is same as gcc_4_7.c
> file though, ie presence of square brackets in the params and regex
> not taking into account the same.
>
> Similarly, there is a total of 67 errors for "Cannot parse *".
> 58 of these are for "error: Cannot parse struct or union!". Rest are
> for enum.
>
> In addition, I observed that if data variables are mentioned in a
> comma separated fashion ie @var1, var2, var3 in comments(as they have
> common description), they may not be considered described.
>
> For eg. running scripts/kernel-doc -none include/linux/mhi.h, we get:
> include/linux/mhi.h:451: warning: Function parameter or member 'M0'
> not described in 'mhi_controller'
> include/linux/mhi.h:451: warning: Function parameter or member 'M2'
> not described in 'mhi_controller'
> include/linux/mhi.h:451: warning: Function parameter or member 'M3'
> not described in 'mhi_controller'
>
> However, these are described in comments as:
> * @M0, M2, M3: Counters to track number of device MHI state changes
>
> and present in struct definition as:
> u32 M0, M2, M3;
>
> I think that comma separated members are allowed in definition, but
> maybe they are not yet supported in comment descriptions, hence
> causing the error. But as the variables are described already in
> comments, maybe this is an example of false positive.
>
> Similarly there may be more such cases for false positives for
> warnings, but it was taking a lot of time to analyze all of them, so I
> decided to take feedback on the findings till now.
>
> What do you think Lukas?
>

I suggest to start to extend the parser to handle those Uninitialized
$param errors. Then look into parsing those structs.

Once kernel-doc can handle those patterns, we can start thinking about
extending the kernel-doc capabilities, e.g., support comma-separated
argument lists, support default argument definitions within a file,
support ignoring/skipping  some definitions between kernel-doc
description and the code (which otherwise cause warnings), etc., etc.

Please reach out with your report to the kernel-doc maintainer,
Jonathan Corbet, and to linux-doc, and explain what you want to do.

Then get started on writing the patches for the parser extension, if
you have not already started anyway.

As soon as you have something to test, send around those patches as
RFC and let everyone know.

Lukas


More information about the Linux-kernel-mentees mailing list