[llvmlinux] Heavy experiments

Marcelo Sousa marceloabsousa at gmail.com
Tue Jul 23 23:11:18 UTC 2013

Hi Renato,

On Tue, Jul 23, 2013 at 3:35 PM, Renato Golin <renato.golin at linaro.org> wrote:

> On 23 July 2013 23:21, Marcelo Sousa <marceloabsousa at gmail.com> wrote:
>> Compare the IR in what sense? In terms of AST's comparison? At the
>> moment, I just want to perform analysis over the IR of several versions of
>> the kernel.
> What kind of analysis? The AST is gone on the IR level, but you could
> compare the number and strength of instructions, divided by their stride,
> if vector instructions, etc. Another way would be to compile the changed
> IRs into executable and run micro-benchmarks, but the kernel is not good
> for that.

First of all, I want to understand whether you're saying that analyzing the
Linux kernel through LLVM IR is not a good approach, or that analyzing LLVM IR
in general is no good.
By analysis, I am referring to fairly standard static analysis techniques
for detecting common memory errors, such as use after free and out-of-bounds
accesses, or for checking reachability properties in general. Moreover, there
is a second area of analysis that is possible through the LLVM IR type system,
or an extension to it, which is what I am doing right now. In short, my
approach is similar to what CQual does for C. For me, it makes sense to do it
at the LLVM IR level because I am interested in analyzing certain interactions
between the kernel and the architecture.
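
As a rough illustration of the kind of use-after-free check meant here, the
sketch below walks a straight-line list of instructions and flags any load or
store through a pointer whose allocation has already been freed. The
instruction tuples and the name find_use_after_free are hypothetical and are
not real LLVM IR or any LLVM API; control flow and inter-procedural effects
are ignored entirely.

```python
# Toy instruction format (hypothetical, NOT real LLVM IR):
#   ("malloc", dst)       dst = malloc(...)
#   ("copy", dst, src)    dst = src          (creates a pointer alias)
#   ("free", ptr)         free(ptr)
#   ("load", dst, ptr)    dst = *ptr
#   ("store", val, ptr)   *ptr = val

def find_use_after_free(instrs):
    """Return the indices of instructions that dereference freed memory."""
    points_to = {}  # pointer name -> allocation id (index of the malloc)
    freed = set()   # allocation ids that have been freed so far
    errors = []
    for i, ins in enumerate(instrs):
        op = ins[0]
        if op == "malloc":
            points_to[ins[1]] = i
        elif op == "copy":
            # Aliases share the allocation id of their source pointer.
            if ins[2] in points_to:
                points_to[ins[1]] = points_to[ins[2]]
        elif op == "free":
            alloc = points_to.get(ins[1])
            if alloc is not None:
                freed.add(alloc)
        elif op in ("load", "store"):
            # The dereferenced pointer is the last operand in both cases.
            if points_to.get(ins[2]) in freed:
                errors.append(i)
    return errors

program = [
    ("malloc", "%p"),
    ("copy", "%q", "%p"),
    ("store", "%v", "%p"),   # fine: %p is still live
    ("free", "%p"),
    ("load", "%x", "%q"),    # flagged: %q aliases the freed allocation
]
print(find_use_after_free(program))  # [4]
```

A real analysis over the kernel would also need branches, loops, and calls,
which is where most of the difficulty lies.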

I do not understand what you mean by "the AST is gone on the IR level". I
can argue that in a naive compilation the IR code is structurally similar to
the original C code; in fact, the SSA transformation is "functionalizing" it.
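
One way to picture the "functionalizing" remark, as a sketch only: a loop in
SSA form can be read as a set of mutually recursive functions, one per basic
block, where each phi node becomes a function parameter. The block names
below (entry, loop_header, loop_body, exit_block) are made up for
illustration and do not come from any compiler output.

```python
def sum_imperative(n):
    """The original loop, written the way the C source would read."""
    i, acc = 0, 0
    while i < n:
        acc += i
        i += 1
    return acc

# SSA-style reading: every basic block is a function, every phi node is a
# parameter, and every branch is a tail call. No variable is reassigned.

def entry(n):
    return loop_header(n, 0, 0)

def loop_header(n, i, acc):
    # i = phi(0, i_next); acc = phi(0, acc_next)
    if i < n:
        return loop_body(n, i, acc)
    return exit_block(acc)

def loop_body(n, i, acc):
    acc_next = acc + i
    i_next = i + 1
    return loop_header(n, i_next, acc_next)

def exit_block(acc):
    return acc

print(sum_imperative(10), entry(10))  # 45 45
```

Python does not eliminate tail calls, so this version only works for small n;
it is meant to show the correspondence, not to be efficient.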

> IR gets bloated quite quickly and (as you may have seen on a number of
> threads on the list), once you split the compilation between Clang and LLC,
> you lose a lot of information (that you shouldn't). It means that what you
> can actually do on the IR level (genetic algorithms to find the best set of
> passes, for instance) is greatly reduced by the fact that, each time, your
> chances of compiling to a final executable that actually makes sense, and
> can be compared to a vanilla Clang binary (with the same flags) tend
> towards zero.

I am not entirely sure what you mean in this paragraph. I believe that the
sort of information that you lose is lost because LLVM IR has design faults,
not necessarily because of the transformation to LLVM IR. Perhaps you can
elaborate on what sort of information you lose besides the annoying
implicit unsignedness, which is relevant for verification, and the fact that
it may be harder to identify higher-level constructs like for-loops.
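
To make the signedness point concrete, under one reading of it: LLVM integer
types such as i32 carry no sign, and signedness shows up only in the choice of
operation (for example icmp slt versus icmp ult). The sketch below models
those two comparisons on raw 32-bit patterns; the Python helper names are
illustrative and simply mirror the IR predicates.

```python
MASK32 = 0xFFFFFFFF

def as_signed32(bits):
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits

def icmp_ult(a, b):
    """Unsigned less-than on 32-bit patterns."""
    return (a & MASK32) < (b & MASK32)

def icmp_slt(a, b):
    """Signed less-than on the same 32-bit patterns."""
    return as_signed32(a) < as_signed32(b)

x = 0xFFFFFFFF  # the same bits: 4294967295 if unsigned, -1 if signed
print(icmp_ult(x, 1))  # False
print(icmp_slt(x, 1))  # True
```

A verifier that sees only the bit pattern has to recover the intended
interpretation from which operations are applied to it.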

> I've seen this kind of experiments (James did that at ARM once) and it
> wasn't easy, but it was just Dhrystone. I fear you'll spend more time
> trying to make the thing work in the first place and will give up mid-way
> to try something different, but I could be wrong.

Can you provide a reference to this work? At this point, I'm really not
sure what you mean by "this kind of experiments".

> Maybe what you could do is to find one specific kernel module (say a
> network or video driver) and do your experiments on it, with a standard
> LLVM-compiled kernel. Or even an independent sub-directory of the kernel
> (HID, or scheduling), but not the whole kernel.

Surely I can apply certain levels of analysis (intra-procedural,
inter-procedural and even inter-modular) to verify components of the
kernel. The hard problem is how to verify several components in a
concurrent setting.

Another question: is one of the goals of the Google summer project to apply
the clang-analyzer to several versions of the kernel, or just the latest one?


> Makes sense?
> cheers,
> --renato