[llvmlinux] [GSoC] Integrating the Clang static analyzer: first (rough) proposal draft

Eduard Bachmakov e.bachmakov at gmail.com
Thu May 2 18:51:22 UTC 2013


This is my first write-up. This would be part of "My Project" in the actual
proposal. The schedule will follow once the scope is a bit more pinned down.

Questions/comments/concerns?

Why bother?
===========
Static analysis detects semantic errors which are usually hard to find:
they surface either when some test fails or under specific conditions no
test was designed for. Or, even worse, there is no crash but a subtle issue
like garbage values. Running a static analyzer allows skipping much of
the debugging involved and fixing the issue right away, or at least thinking
through why the analyzer would report a false positive, saving time and
sleep.

clang-analyzer (checker) is one such static analyzer and fits nicely within
the llvmlinux project.

What?
=====
In order to get a pleasurable developer experience, multiple steps are
necessary:

   - Integrating checker into the build system
      - Simple checks are already easy to do by setting the $C and $CHECK
      variables. However, most of the time more context than the offending
      line is necessary (e.g. for a null pointer dereference); `scan-build`
      provides much of that context.
      - Using `scan-build` within the build system is non-trivial.
      Integrating a target, e.g. `make analysis`, would be the first goal.
   - Integration with buildbot
      - Instead of capturing simple stdio, the idea is to extend the
      buildbot to associate each build with the relevant analysis report.
      This would e.g. allow interested kernel developers who do not want to
      go through the trouble of setting up their own build system to see
      any/all issues with their code (per target).
   - Choosing relevant existing checks
      - checker already has a sizable list of available checks,
      http://clang-analyzer.llvm.org/available_checks.html , and not all
      are relevant for Linux. The goal is to find a reasonable default set.
   - Modifying existing checks
      - Some of the checks don't necessarily work as intended. "Undefined
      or garbage value returned to caller" is distracting if the variable was
      created using a macro that explicitly states so.
   - Add new checks
      - The existing checks are by no means exhaustive. With a project as
      big as Linux, there should be plenty of bugs available to derive new
      checks from. [I know this is handwavy, hoping for input]
   - Create aggregate statistics tool
      - What goes wrong most?
      - Whose code is breaking (... checker)?
      - etc.
   - What else?
      - @LLVMLINUX folks, what would you like to see happen?
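
To make the build-system item a bit more concrete, here is a rough sketch of
what a `make analysis` target might end up running. It only assembles a
`scan-build` command line around a `make` invocation. The helper name
`scan_build_command`, the report directory and the choice of disabled check
are my own placeholders, and a real kernel build will most likely need
further make variables that are not shown here.

```python
import shlex

def scan_build_command(make_args, report_dir, enable=(), disable=()):
    """Build the argv for running a make invocation under scan-build.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # -o tells scan-build where to write its reports.
    cmd = ["scan-build", "-o", report_dir]
    # Checks can be switched on or off individually.
    for name in enable:
        cmd += ["-enable-checker", name]
    for name in disable:
        cmd += ["-disable-checker", name]
    # Everything after the scan-build options is the build command itself.
    return cmd + ["make"] + list(make_args)

cmd = scan_build_command(
    make_args=["-j4"],                 # placeholder make arguments
    report_dir="analysis-reports",     # placeholder output directory
    disable=["deadcode.DeadStores"],   # example of a default being tuned
)
print(shlex.join(cmd))
```

The enable/disable lists are also where a "reasonable default" set of checks
could live once it has been chosen.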

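For the aggregate statistics item, a minimal sketch of the kind of counting I
have in mind. The findings below are made-up sample data, and how they would
be extracted from the analyzer's reports is left open. Counting is per source
file; mapping files to authors would be a separate step.

```python
from collections import Counter

# Made-up sample data: (check name, source file) pairs. Extracting these
# from real analysis reports is not covered by this sketch.
findings = [
    ("Dereference of null pointer", "drivers/foo/a.c"),
    ("Undefined or garbage value returned to caller", "drivers/foo/a.c"),
    ("Dereference of null pointer", "fs/bar/b.c"),
    ("Dereference of null pointer", "drivers/foo/a.c"),
]

def summarize(findings):
    """Count findings per check ("what goes wrong most?") and per file."""
    by_check = Counter(check for check, _ in findings)
    by_file = Counter(path for _, path in findings)
    return by_check, by_file

by_check, by_file = summarize(findings)

# Most frequent entries first.
for check, count in by_check.most_common():
    print(f"{count:4d}  {check}")
for path, count in by_file.most_common():
    print(f"{count:4d}  {path}")
```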

When/How?
=========
TBD.