[llvmlinux] "make test" for x86_64 target just hung there, why?

Mon Aug 10 23:57:31 UTC 2015

Some progress:

I bisected from 4.1 to 4.2-rc1 and found this. 
Sounds scary with clang in mind. This really assumes gcc ;).

@David: what do you think? Can clang deal with it ?

be6cb02779ca74d83481f017db21578cfe92891c is the first bad commit
commit be6cb02779ca74d83481f017db21578cfe92891c
Author: Ingo Molnar <mingo at kernel.org>
Date:   Fri Apr 10 14:08:46 2015 +0200

    x86: Align jump targets to 1-byte boundaries

    The following NOP in a hot function caught my attention:

      >   5a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

    That's a dead NOP that bloats the function a bit, added for the
    default 16-byte alignment that GCC applies for jump targets.

    I realize that x86 CPU manufacturers recommend 16-byte jump
    target alignments (it's in the Intel optimization manual),
    to help their relatively narrow decoder prefetch alignment
    and uop cache constraints, but the cost of that is very
    significant:

            text           data       bss         dec      filename
        12566391        1617840   1089536    15273767      vmlinux.align.16-byte
        12224951        1617840   1089536    14932327      vmlinux.align.1-byte

    By using 1-byte jump target alignment (i.e. no alignment at all)
    we get an almost 3% reduction in kernel size (!) - and a
    probably similar reduction in I$ footprint.
[...]
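
To sanity-check the "almost 3%" figure, here is a minimal sketch that
recomputes the text-size reduction from the two rows quoted above. The
variable and function names are made up for illustration; only the two
byte counts come from the commit message.

```python
# Text section sizes in bytes, copied from the size output quoted above.
TEXT_ALIGN_16 = 12566391  # vmlinux.align.16-byte
TEXT_ALIGN_1 = 12224951   # vmlinux.align.1-byte


def text_reduction_percent(before: int, after: int) -> float:
    """Relative shrink of the text section, in percent of the old size."""
    return (before - after) / before * 100.0


saved_bytes = TEXT_ALIGN_16 - TEXT_ALIGN_1
saved_percent = text_reduction_percent(TEXT_ALIGN_16, TEXT_ALIGN_1)
print(f"text shrinks by {saved_bytes} bytes ({saved_percent:.2f}%)")
```

The data and bss columns are identical in both builds, so the whole
difference in the dec column comes from text.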

Best,
Jan-Simon

On Monday, 10 August 2015, 18:57:29, Jan-Simon Moeller wrote:
> So what I can see in your a2l.log is that it fails somewhere between
> init/main smp_setup_processor_id
> and the
> pr_notice after page_address_init .
> It points to memory init imho - there were a lot of small changes in the
> latest cycle (and ASM changes, too).
> 
> 
> What I see in my log is similar ...
> page_address_init ~ setup_arch ~ then arch/x86/kernel/setup.c:898
> setup.c:898 is a printk actually ...
> early_idt_handler_array[i]  ~> early_idt_handler_common
> 
> then
> early_idt_handler_common at arch/x86/kernel/head_64.S:397
> dump_stack at lib/dump_stack.c:27
> 
> dump_stack is already the stack trace.
> 
> So somewhere in arch/x86/kernel/setup.c or arch/x86/kernel/head_64.S
> 
> commit cdeb6048940fa4bfb429e2f1cba0d28a11e20cd5
> Author: Andy Lutomirski <luto at kernel.org>
> Date:   Fri May 22 16:15:47 2015 -0700
>     x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers
> 
> maybe ?
> 
> 
> Best,
> JS
> 
> On Monday, 10 August 2015, 23:26:59, Peter Teoh wrote:
> > Thank you Jan and David,
> > 
> > here is the a2l.log.gz file as an attachment. Not sure if it makes sense to
> > you?
> > 
> > On Mon, Aug 10, 2015 at 10:53 PM, Jan-Simon Moeller <dl9pf at gmx.de> wrote:
> > > Yes, it's easier to look at -d in_asm.
> > > 
> > > For that reason (and to not forget the commands ;) ) I added to
> > > targets/x86_64 the make goal
> > > 
> > > "make test-qemu-debug" ... Then take a look at a2l.log .
> > > 
> > > 
> > > It will generate a few files:
> > > - qemulog.log  is the full dose of -din_asm,op,int,exec,cpu,cpu_reset,
> > > - debugaddr.log  has just the mem addr grep'ed out
> > > - addresses.log is the last 2000 of these w/o the rest of the line
> > > - a2l.log  is the output of addr2line for each of the lines in
> > > addresses.log
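
The file chain described above (full log, addresses only, last 2000,
addr2line) can be sketched roughly as follows. This is not the actual
makefile rule from targets/x86_64. The regular expression assumes that each
instruction line in the -d in_asm output starts with its address followed
by a colon, and the helper names are hypothetical.

```python
import re

# Assumption: instruction lines look like "0xffffffff81000000:  mov ...".
# Adjust the pattern if the log format differs.
ADDRESS_AT_LINE_START = re.compile(r"^(0x[0-9a-fA-F]+):")


def last_addresses(log_lines, limit=2000):
    """Return the last `limit` instruction addresses found in the log."""
    addresses = []
    for line in log_lines:
        match = ADDRESS_AT_LINE_START.match(line)
        if match:
            addresses.append(match.group(1))
    # A plain addresses[-0:] would return everything, so guard limit <= 0.
    return addresses[-limit:] if limit > 0 else []


def addr2line_command(vmlinux, addresses):
    """Build an addr2line call: -f prints function names, -e names the ELF file."""
    return ["addr2line", "-f", "-e", vmlinux, *addresses]


# Tiny made-up log excerpt, only to show the shape of the data.
sample = [
    "IN: ",
    "0xffffffff81000000:  mov    %rax,%rbx",
    "0xffffffff81000003:  ret",
    "",
]
print(addr2line_command("vmlinux", last_addresses(sample)))
```

The resulting list can be handed to subprocess.run() to produce something
like a2l.log. The vmlinux has to be built with debug info for addr2line to
report file and line.
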
> > > 
> > > 
> > > In theory it should point to the last functions executed and print out
> > > the function name/line right in the llvmlinux kernel.
> > > 
> > > 
> > > Still some grep'ing remains in case there are a lot of prints (e.g. a
> > > stack trace).
> > > 
> > > Possibly limit the amount of data (just limit to -din_asm) in the
> > > makefile.
> > > 
> > > Just compiling now ...
> > > 
> > > Best,
> > > Jan-Simon
> > > 
> > > On Monday, 10 August 2015, 09:12:18, David Woodhouse wrote:
> > > > On Sat, 2015-08-08 at 09:33 +0800, Peter Teoh wrote:
> > > > > On Sat, Aug 8, 2015 at 8:24 AM, Jan-Simon Moeller <dl9pf at gmx.de> wrote:
> > > > > > This is probably due to a lockup in early boot stages (16bit boot
> > > > > > code).
> > > > 
> > > > I believe I did fix all of that once, except for the clang bug where it
> > > > doesn't honour -mregparm=3 for calls to intrinsics like memcpy:
> > > > https://llvm.org/bugs/show_bug.cgi?id=3997
> > > > 
> > > > But I'd assume llvmlinux is still carrying the patch which avoids the
> > > > issue with an explicit call to its memcpy function instead of just
> > > > doing a struct assignment and letting LLVM turn it into a memcpy?
> > > > 
> > > > Perhaps another such issue has arisen, though?
> > > > 
> > > > > so is there any way to do debugging through "-s -S" option?
> > > > 
> > > > Debugging 16-bit code with gdb was relatively painful. A lot of the
> > > > time it's easier just to run it with -d in_asm and read what happened.