[llvmlinux] Heavy experiments

Marcelo Sousa marceloabsousa at gmail.com
Wed Jul 24 01:08:37 UTC 2013


Hi Renato,


On Tue, Jul 23, 2013 at 5:08 PM, Renato Golin <renato.golin at linaro.org> wrote:

> On 24 July 2013 00:11, Marcelo Sousa <marceloabsousa at gmail.com> wrote:
>
>> First of all, I want to understand if you're stating that analyzing the
>> Linux Kernel through LLVM IR is not a good approach or analyzing LLVM IR in
>> general is no good.
>>
>
> The latter.
>

I don't agree, and in fact it feels to me that there is almost a new field
of software verification over LLVM IR. Just to start, there are several
examples of bounded model checkers and symbolic execution procedures that
operate over LLVM IR. Are you somehow implying that this kind of analysis is
no good?


>
>
>
>>  I do not understand what you mean by "the AST is gone on the IR level".
>> I can argue that in a naive compilation the IR code is structurally similar
>> to the original C code; in fact, the SSA transformation is "functionalizing".
>>
>
> That's not entirely true. All the IR needs to do is to "behave" like the
> original code, but most of the semantics is gone for good.
>

You need to define "behave" to talk about semantics. Are you saying that
bisimulation is not related to semantics?

Let's go through your examples, and please do enumerate all the cases because
I'm very interested in bundling them.

>
> Examples:
>  - Sign information is on instructions, not types
>
This is not a semantic problem, because you can refine the LLVM IR type
system. In fact, that is something I am doing.
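
Just to illustrate what I mean (a hand-written sketch, not actual clang
output; the names are made up): the signedness that the C types carry ends up
on the opcodes, so a refinement analysis can push it back onto the types.

  ; C: int f(int a, unsigned b) { return a / (int)b + (int)(b / 2u); }
  define i32 @f(i32 %a, i32 %b) {
  entry:
    %sd = sdiv i32 %a, %b   ; signed division: the sign lives on the instruction
    %ud = udiv i32 %b, 2    ; unsigned division: same i32 type, different opcode
    %r  = add i32 %sd, %ud
    ret i32 %r
  }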

>  - Structures are flattened out, or cast to arrays
>
The structural information is still there; otherwise it's not a sound
compilation. I need to see real examples of C code where the original
structures cannot be retrieved anymore. Moreover, a sound flattening (which
I still need to understand in the particular case of LLVM) is probably a
more reliable model for certain types of verification where it is important
to know that the flattening is happening.
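
For instance, in the unoptimized IR I have looked at, a field access still
carries the struct type and the field index, so the layout is recoverable.
A hand-written sketch in 3.x syntax (illustrative names, not actual clang
output):

  ; C: struct pt { int x; int y; };
  ;    int getx(struct pt *p) { return p->x; }
  %struct.pt = type { i32, i32 }

  define i32 @getx(%struct.pt* %p) {
  entry:
    %xaddr = getelementptr inbounds %struct.pt* %p, i32 0, i32 0
    %x = load i32* %xaddr
    ret i32 %x
  }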

>  - Multiple structures are commoned up if they're structurally equivalent
>
Not sure what the problem is here. If LLVM IR can identify that two structs
are structurally equivalent for free, that seems good to me. Are you
saying that it will give wrong type annotations to values that have the
same structural type? If that's the case, it's clearly something that
should/can be fixed.

>  - Function calls are changed (ByVal, ABI specific, architecture specific)
>
Not sure why this is a problem for verification either. In fact, again I get
a much more reliable execution model.
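
As far as I can tell, what happens is that the IR signature records the ABI
decision explicitly; a sketch of the shape I mean for x86-64 (hand-written,
the exact coercion depends on the clang version and target):

  ; C: struct pair { int a; int b; };
  ;    int take(struct pair p);       /* small: coerced into one register */
  ;    int take_big(struct big p);    /* large: passed in memory, byval */
  %struct.big = type { [8 x i64] }

  declare i32 @take(i64)                     ; the two ints packed into an i64
  declare i32 @take_big(%struct.big* byval)  ; aggregate passed by hidden pointer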

>  - Intrinsics are expanded into IR code
>
>  - Perfectly valid code is shrunk into intrinsics (optimizing, legalizing,
> atomic ops, pragmas)
>
Looking at some IR of the kernel, I'm not sure there is heavy use of
intrinsics. Nevertheless, this is one criticism of LLVM IR: it's not
supposed to have any intrinsics at all, but rather to generate appropriate
optimized assembly code through patterns. Note that I'm being somewhat of a
purist here, i.e. LLVM is supposed to be a Virtual Machine. There are almost
more intrinsics than instructions, and that's not good from any perspective.
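
A typical case I do see is builtins and aggregate copies turning into
intrinsic calls instead of plain IR, e.g. (a generic sketch, not taken from
the kernel):

  ; C: return __builtin_popcount(x);
  declare i32 @llvm.ctpop.i32(i32)

  define i32 @popcount(i32 %x) {
  entry:
    %n = call i32 @llvm.ctpop.i32(i32 %x)   ; lowered to an intrinsic, not plain IR
    ret i32 %n
  }

(and a struct assignment usually becomes a call to @llvm.memcpy.* rather
than element-wise loads and stores).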

>  - Vectorization can explode the amount of IR beyond recognition
>
Are you referring to vectorization as a compiler transformation? Please
elaborate and if possible provide an example of this. Nevertheless, I doubt
that this happens in the particular case of the Linux Kernel.
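
To make sure we are talking about the same transformation: what I have in
mind is the loop vectorizer rewriting scalar iterations into vector
operations, roughly as below (hand-written sketch in 3.x syntax; real
vectorizer output also adds runtime checks and a scalar remainder loop,
which is where the size blow-up would come from):

  ; C: for (i = 0; i < 4; i++) a[i] += b[i];
  define void @add4(<4 x i32>* %a, <4 x i32>* %b) {
  entry:
    %va = load <4 x i32>* %a
    %vb = load <4 x i32>* %b
    %s  = add <4 x i32> %va, %vb
    store <4 x i32> %s, <4 x i32>* %a
    ret void
  }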

>  - Inlining shuffles code and will hide any analysis you do on the inlinee
>
I fail to see this point if your analysis is sound, i.e. the analysis
results for the inlinee when it is analyzed independently are more general
than the ones you get when you analyze the caller after inlining. This may
affect the precision for the inlinee analyzed independently, but the
analysis is still precise at the inlining site.
For example, in a type-based analysis this can mean that the inlinee gets a
polymorphic type and the function that contains the body of the inlinee is
just a specialization.

>  - PHI nodes, basic blocks for loops and switches can be recognized, but
> they don't mean the same things anymore
>
So what do they mean? Sure, the small-step operational semantics of a phi
instruction does not by itself mean that there is a loop, but you can
identify a pattern that is a loop. However, note that I'm not claiming that
there is a bijection between a particular C program and the LLVM IR code,
but I do claim a bisimulation relation between C + Arch and LLVM IR + Arch',
where Arch' is a subset of Arch, and they really complete the
*real/concrete* execution model with the information about the
architecture.
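
Concretely, the loop pattern is still there: a header block, phi nodes for
the induction variables, and a back-edge. A hand-written sketch of what I
mean (not actual clang output):

  ; C: int sum(int n) { int s = 0; for (int i = 0; i < n; i++) s += i; return s; }
  define i32 @sum(i32 %n) {
  entry:
    %enter = icmp sgt i32 %n, 0
    br i1 %enter, label %loop, label %exit

  loop:
    %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
    %s = phi i32 [ 0, %entry ], [ %s.next, %loop ]
    %s.next = add i32 %s, %i
    %i.next = add i32 %i, 1
    %again = icmp slt i32 %i.next, %n
    br i1 %again, label %loop, label %exit

  exit:
    %res = phi i32 [ 0, %entry ], [ %s.next, %loop ]
    ret i32 %res
  }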

>  - Target-specific lowering does destroy the IR, making it impossible to
> be portable, sometimes even to sub-architectures
>
Can you give me some concrete examples of this? Even if that's the case,
this is useful for software verification that considers architectural
information. This is exactly what I'm trying to do with the Linux Kernel at
the moment: I'm just trying to show that for x86 certain behaviours that
are arch-specific due to inline assembly instructions do not occur.
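
For the record, this is the kind of thing I mean by arch-specific behaviour:
kernel-style inline assembly shows up in the IR as an asm call expression
with its constraint string, e.g. (a made-up minimal example, not actual
kernel code):

  ; C: asm volatile("cli" ::: "memory");   /* x86: disable interrupts */
  define void @irq_off() {
  entry:
    call void asm sideeffect "cli", "~{memory}"()
    ret void
  }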

>  - C++ gets flattened out to C in the Clang AST, and then to
> target-specific in IR, it's virtually impossible to attach any meaning to it
>
If that's the case, what is the argument to claim the "capability of
representing ‘all’ high-level languages cleanly"? I agree with you that
it's a hard task to have a back-end to C++, but semantically it is not
"virtually impossible".

>  - SCEV, DCE, CSE and undefined behaviour will *also* change IR beyond
> recognition
>
I don't recognize SCEV, DCE, CSE. About undefined behaviour, are you
talking about *undef*? We should really have a discussion just about the
semantics of *undef*, or write a paper about "Defining undef". Again, I
agree that it is possible for transformations to change the IR in ways that
make lifting to a C program similar to the original very hard, but *sound*
transformations will not change its behaviour - semantic preservation.
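
To give a flavour of why I think *undef* deserves its own discussion: as I
read the LangRef, every use of undef may take a different value, so even a
trivial identity like x ^ x == 0 is not guaranteed (minimal sketch):

  define i32 @f() {
  entry:
    %x = xor i32 undef, undef   ; not necessarily 0: each undef use may differ
    ret i32 %x
  }
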
Moreover, note that by performing the analysis at the IR level you are not
only able to identify potential real bugs in your "high-level" program, but
you are also doing compiler certification. Understanding the origin of an
error might be hard, but I find the analysis particularly valuable.

>  - IR has no knowledge of the pre-compiler syntax
>
I'm not sure why this is relevant for verification. Can you provide an
example?

>
> I could go on forever, but you get the idea. IR is NOT a representation of
> your code. At best, it's a high-level assembly, and that's what it was
> designed for, and for that, it does a pretty good job.
>

I need to be further convinced that an Intermediate Representation is NOT a
Representation of the code.

>
>
> Not entirely sure what you mean in this paragraph. I believe that the sort
>> of information that you lose is because LLVM IR has design faults, not
>> necessarily because of the transformation to LLVM IR. Perhaps you can
>> elaborate on what sort of information you lose besides the annoying
>> implicit unsignedness that is relevant for verification and the fact that
>> it may be harder to identify higher-abstraction constructs like for-loops.
>>
>
> Not design flaws, design. It wasn't designed to represent any particular
> language and it won't be a 1-to-1 match to any of them.
>
>
>  Can you provide a reference to this work? At this point, I'm really not
>> sure what you mean by "this kind of experiments".
>>
>
> Unfortunately not, it was an internal pet project. Basically, the idea was
> to add pass flags to LLC via a Python script that would discard the bad
> results (can't compile, can't run) and cross the good results, optimizing
> for performance. It was a simple experiment, but it showed with great clarity
> the interaction between the most used (and most powerful) passes.
>
>
> Surely I can apply certain levels of analysis (intra-procedural,
>> inter-procedural and even inter-modular) to verify components of the
>> kernel. The hard problem is how to verify several components in a
>> concurrent setting.
>>
>
> Not to mention the sheer size of the kernel, that will make any small task
> a pain.
>
I have a lightweight type analysis that I believe is scalable to verify the
kernel in a couple of minutes. I can verify modules within a second.


>
> Another question: Is one of the goals of the google summer project to
>> apply the clang-analyzer to several versions of the kernel or just the
>> latest one?
>>
>
> I don't know. But AFAIK, the Clang Analyser is not done in IR, but in the
> Clang AST (where it should be, because of the problems I mentioned). So,
> it's quite possible that you don't need to generate IR at all.
>
> Any semantic analysis must be done at the Clang's AST, which is quite rich
> and powerful. IR is most suitable for language/target-independent
> transformations.
>

I don't know the details of Clang's AST, but I need to be convinced. In
fact, I know of examples where an analysis at the LLVM IR level can catch
errors (semantic errors of the original program) that the clang analyzer
currently does not.

Thanks for this discussion.

Regards,
Marcelo


>
> Hope that clears up a bit... ;)
>
> cheers,
> --renato
>
> _______________________________________________
> LLVMLinux mailing list
> LLVMLinux at lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/llvmlinux
>