<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Aug 11, 2015 at 7:57 AM, Jan-Simon Moeller <span dir="ltr"><<a href="mailto:dl9pf@gmx.de" target="_blank">dl9pf@gmx.de</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Some progress:<br>
<br>
I bisected from 4.1 to 4.2-rc1 and found this.<br>
Sounds scary with clang in mind. This really assumes gcc ;).<br>
<br>
@David: what do you think? Can clang deal with it ?<br>
<br>
<br>
be6cb02779ca74d83481f017db21578cfe92891c is the first bad commit<br>
commit be6cb02779ca74d83481f017db21578cfe92891c<br>
Author: Ingo Molnar <<a href="mailto:mingo@kernel.org">mingo@kernel.org</a>><br>
Date: Fri Apr 10 14:08:46 2015 +0200<br>
<br></blockquote><div><br></div><div>Thank you Jan, Woodhouse and everyone. I tested with the latest revert patch and it worked:<br><br>commit 8cb2f092ee1fbfe17cb9c58cd3636ff60a74d88b<br>Author: Jan-Simon Möller <<a href="mailto:dl9pf@gmx.de">dl9pf@gmx.de</a>><br>Date: Tue Aug 11 02:20:33 2015 +0200<br><br> Temporarily revert this patch until we discussed a solution.<br> <br> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
x86: Align jump targets to 1-byte boundaries<br>
<br>
The following NOP in a hot function caught my attention:<br>
<br>
> 5a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)<br>
<br>
That's a dead NOP that bloats the function a bit, added for the<br>
default 16-byte alignment that GCC applies for jump targets.<br>
<br>
I realize that x86 CPU manufacturers recommend 16-byte jump<br>
target alignments (it's in the Intel optimization manual),<br>
to help their relatively narrow decoder prefetch alignment<br>
and uop cache constraints, but the cost of that is very<br>
significant:<br>
<br>
text data bss dec filename<br>
12566391 1617840 1089536 15273767 vmlinux.align.16-byte<br>
12224951 1617840 1089536 14932327 vmlinux.align.1-byte<br>
<br>
By using 1-byte jump target alignment (i.e. no alignment at all)<br>
we get an almost 3% reduction in kernel size (!) - and a<br>
probably similar reduction in I$ footprint.<br>
[...]<br>
<br>
<br>
Best,<br>
Jan-Simon<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
<br>
On Monday, 10 August 2015, 18:57:29, Jan-Simon Moeller wrote:<br>
> So what I can see in your a2l.log is that it fails somewhere between<br>
> init/main smp_setup_processor_id<br>
> and the<br>
> pr_notice after page_address_init .<br>
> It points to memory init imho - there were a lot of small changes in the<br>
> latest cycle (and ASM changes, too).<br>
><br>
><br>
> What I see in my log is similar ...<br>
> page_address_init ~ setup_arch ~ then arch/x86/kernel/setup.c:898<br>
> setup.c:898 is a printk actually ...<br>
> early_idt_handler_array[i] ~> early_idt_handler_common<br>
><br>
> then<br>
> early_idt_handler_common at arch/x86/kernel/head_64.S:397<br>
> dump_stack at lib/dump_stack.c:27<br>
><br>
> dump stack is already the stacktrace.<br>
><br>
> So somewhere in arch/x86/kernel/setup.c or arch/x86/kernel/head_64.S<br>
><br>
> commit cdeb6048940fa4bfb429e2f1cba0d28a11e20cd5<br>
> Author: Andy Lutomirski <<a href="mailto:luto@kernel.org">luto@kernel.org</a>><br>
> Date: Fri May 22 16:15:47 2015 -0700<br>
> x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers<br>
><br>
> maybe ?<br>
><br>
><br>
> Best,<br>
> JS<br>
><br>
> On Monday, 10 August 2015, 23:26:59, Peter Teoh wrote:<br>
> > Thank you Jan and David,<br>
> ><br>
> > here is the a2l.log.gz file as an attachment. Not sure if it makes sense to<br>
> > you?<br>
> ><br>
> > On Mon, Aug 10, 2015 at 10:53 PM, Jan-Simon Moeller <<a href="mailto:dl9pf@gmx.de">dl9pf@gmx.de</a>> wrote:<br>
> > > Yes, it's easier to look at -d in_asm.<br>
> > ><br>
> > > For that reason (and to not forget the commands ;) ) I added to<br>
> > > targets/x86_64 the make goal<br>
> > ><br>
> > > "make test-qemu-debug" ... Then take a look at a2l.log .<br>
> > ><br>
> > ><br>
> > > It will generate a few files:<br>
> > > - qemulog.log is the full dose of -din_asm,op,int,exec,cpu,cpu_reset,<br>
> > > - debugaddr.log has just the mem addr grep'ed out<br>
> > > - addresses.log is the last 2000 of these w/o the rest of the line<br>
> > > - a2l.log is the output of addr2line for each of the lines in<br>
> > > addresses.log<br>
> > ><br>
> > ><br>
> > > In theory it should point to the last functions executed and print out<br>
> > > the function name/line right in the llvmlinux kernel.<br>
> > ><br>
> > ><br>
> > > Still some grep'ing remains in case there're a lot of prints (e.g.<br>
> > > stacktrace).<br>
> > ><br>
> > > Possibly limit the amount of data (just limit to -din_asm) in the<br>
> > > makefile.<br>
> > ><br>
> > > Just compiling now ...<br>
> > ><br>
> > > Best,<br>
> > > Jan-Simon<br>
> > ><br>
> > > On Monday, 10 August 2015, 09:12:18, David Woodhouse wrote:<br>
> > > > On Sat, 2015-08-08 at 09:33 +0800, Peter Teoh wrote:<br>
> > > > > On Sat, Aug 8, 2015 at 8:24 AM, Jan-Simon Moeller <<a href="mailto:dl9pf@gmx.de">dl9pf@gmx.de</a>> wrote:<br>
> > > > > > This is probably due to a lockup in early boot stages (16bit boot<br>
> > > > > > code).<br>
> > > ><br>
> > > > I believe I did fix all of that once, except for the clang bug where it<br>
> > > > doesn't honour -mregparm=3 for calls to intrinsics like memcpy:<br>
> > > > <a href="https://llvm.org/bugs/show_bug.cgi?id=3997" rel="noreferrer" target="_blank">https://llvm.org/bugs/show_bug.cgi?id=3997</a><br>
> > > ><br>
> > > > But I'd assume llvmlinux is still carrying the patch which avoids the<br>
> > > > issue with an explicit call to its memcpy function instead of just<br>
> > > > doing a struct assignment and letting LLVM turn it into a memcpy?<br>
> > > ><br>
> > > > Perhaps another such issue has arisen, though?<br>
> > > ><br>
> > > > > so is there any way to do debugging through "-s -S" option?<br>
> > > ><br>
> > > > Debugging 16-bit code with gdb was relatively painful. A lot of the<br>
> > > > time it's easier just to run it with -d in_asm and read what happened.<br>
<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature">Regards,<br>Peter Teoh</div>
</div></div>