[llvmlinux] [GSoC] Integrating the Clang static analyzer: first (rough) proposal draft
e.bachmakov at gmail.com
Fri May 3 05:05:23 UTC 2013
Thanks for your input, JS! Here's v2. What exactly did you mean by
"sparse is a start (replicate functionality, extend functionality)"?
(If the table is hard to read for anyone, here's a rendered version:
Static analysis detects semantic errors which are usually hard to find: either
some test fails or the issue appears under specific conditions no test was
designed for. Or, even worse, there is no crash but a subtle issue like
garbage values. Static analysis allows skipping much of the debugging
involved and fixing the issue right away, or at least thinking through why
the analyzer returned a false positive, saving time and sleep.
Especially in the case of the Linux kernel, correctness (or at least
predictability) is important. The kernel runs on millions of devices of all
shapes and sizes (think TOP500 to Android) and is developed at a rapid pace,
so a method for dealing with those 20% of the code that take 80% of the time
would be invaluable and have a significant impact on the quality of the code
released.
clang-analyzer (checker) is one such static analyzer and fits nicely within
the llvmlinux project.
In order to get a pleasurable developer experience, multiple steps are
necessary:
* Integrating checker into the build system
* Simple checks are already easy to do by setting the $C and $CHECK variables.
However, most of the time more context than the offending line is necessary
(e.g. for a null pointer dereference), and `scan-build` provides much of
that context.
* Using `scan-build` within the build system is non-trivial. Integration of a
target, e.g. `make analysis` would be the first goal.
* Integration with buildbot
* Instead of capturing simple stdio, the idea is to extend the buildbot to
associate each build with the relevant analysis report. This would e.g.
allow interested kernel developers who do not want to go through the trouble
of setting up their own build system to see any/all issues with their code.
* Create aggregate statistics tool
* What goes wrong most?
* Whose code is breaking (... checker)?
* Choosing relevant existing checks
* checker already has a sizable list of available checks
http://clang-analyzer.llvm.org/available_checks.html , and not all are
relevant for Linux. The goal is to find a reasonable default.
* Modifying existing checks
* Some of the checks don't necessarily work as intended. "Undefined or
garbage value returned to caller" is distracting if the variable was
created using a macro that explicitly states so.
* Add new checks
* The existing checks are by no means exhaustive. With a project as big as
Linux, there should be plenty of bugs available to derive new checks from.
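
To make the `scan-build` integration and the choice of default checks a bit
more concrete, here is a minimal sketch in Python of how a `make analysis`
style wrapper might assemble its command line. The function name, the report
directory and the particular checkers picked are assumptions for illustration
only, not a proposed default; `-o`, `-enable-checker` and `-disable-checker`
are existing `scan-build` options. The sketch only builds and prints the
command, it does not run anything.

```python
import shlex


def build_scan_build_command(report_dir, make_args, enable=(), disable=()):
    """Return the argv list for running `make` under `scan-build`.

    report_dir -- directory where scan-build should write its reports
    make_args  -- extra arguments passed through to make
    enable     -- checker names to switch on in addition to the defaults
    disable    -- checker names to switch off
    """
    argv = ["scan-build", "-o", report_dir]
    for name in enable:
        argv += ["-enable-checker", name]
    for name in disable:
        argv += ["-disable-checker", name]
    return argv + ["make"] + list(make_args)


if __name__ == "__main__":
    cmd = build_scan_build_command(
        report_dir="analysis-reports",            # hypothetical directory
        make_args=["-j4"],
        enable=["core.NullDereference"],          # illustrative choice
        # To my knowledge this is the checker behind "Undefined or garbage
        # value returned to caller"; disabling it here is only an example.
        disable=["core.uninitialized.UndefReturn"],
    )
    # Print a copy-pasteable shell command instead of executing it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the checker lists as plain data would also make it easy to enable or
disable checks per build later on.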
Jun-1,2: Familiarize myself with llvmlinux build system,
design of `scan-build`, buildbot, general other
Jun-3,4: Integrate `scan-build` into build system as either
first-class citizen or completely transparently
Jun-5: Integrate checking functionality into buildbot
Jul-1,2: Implement summaries into buildbot and (optionally)
the build system.
Jul-3,4: Analyze the results of checker, determine which
are legitimate and which are false positives and/or
inapplicable, disable the latter.
Aug-2: Investigate the former. Report (tons of?) bugs.
Aug-3,4,5: Fix checks that are fixable. Depending on
circumstance, design a system to dynamically enable
or disable checks. Jump to next point if done.
Sep-1,2: Implement new checks for kernel/assembly related
Sep-3: Buffer week (polish, additional documentation, etc)
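
As a rough idea of what the summaries planned for Jul-1,2 could start from,
here is a small Python sketch that answers "what goes wrong most?" and
"which files are affected most?". The record layout (dictionaries with
`type` and `file` keys) and the sample entries are made up for illustration;
how the records would actually be extracted from the analysis reports is
left open.

```python
from collections import Counter


def summarize(reports):
    """Count analysis reports per bug type and per source file.

    reports -- iterable of dicts with (hypothetical) keys "type" and "file"
    Returns two lists of (name, count) pairs, most frequent first.
    """
    reports = list(reports)
    by_type = Counter(r["type"] for r in reports)
    by_file = Counter(r["file"] for r in reports)
    return by_type.most_common(), by_file.most_common()


if __name__ == "__main__":
    # Invented sample data, only to show the shape of the input.
    sample = [
        {"type": "Dereference of null pointer", "file": "drivers/a.c"},
        {"type": "Dereference of null pointer", "file": "drivers/a.c"},
        {"type": "Dead assignment", "file": "fs/b.c"},
    ]
    types, files = summarize(sample)
    for name, count in types:
        print(count, name)
    for name, count in files:
        print(count, name)
```

Mapping files to authors (e.g. via the version control history) would be a
separate step on top of these counts.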