[lxc-devel] segfault on shutdown if containers running

Adam Majer adamm at zombino.com
Wed Jul 1 21:29:45 PDT 2009

Sukadev Bhattiprolu wrote:
> Daniel Lezcano [dlezcano at fr.ibm.com] wrote:
>> Adam Majer wrote:
>>>>>>> chrdev_open + 0x148/0x167
>>>>>>> chrdev_open + 0x0/0x167
>>>>>>> __dentry_open + 0x148/0x260
>>>>>>> do_filp_open + 0x468/0x85a
>>>>>>> alloc_fd + ....
>>>>>>> do_sys_gen + ...
>>>>>>> system_call_fastpath + ....
>>>>>>> RIP tty_open
>>>>>> which kernel are you using ? could you run :
>>>>>>     $ addr2line -e <vmlinux> ffffffff803b26f1
>>>>> Would setting this to yes produce a backtrace with line numbers?
>>>> Yes and maybe CONFIG_FRAME_POINTER too.
>>> Well, adding symbols didn't add line numbers to the backtrace. But now I
>>> can use addr2line with the vmlinux (all 78mb of it) to get you guys a
>>> backtrace with line numbers :)
>>> So,
>>> OOPS (NULL pointer dereference) at
>>>   drivers/char/tty_io.c:1321  (tty_open)
>>> Backtrace,
>>>   fs/char_dev.c:397 (chrdev_open)
>>>   fs/char_dev.c:357 (chrdev_open)
>>>   fs/open.c:841     (__dentry_open)
>>>   arch/x86/include/asm/atomic_64.h:117 (do_filp_open)
>>>   fs/file.c:459     (alloc_fd)
>>>   ...
>> Excellent !
>> Suka ? Isn't this oops related to the newpts instance ?
> What version of the kernel are you running ? 
> Could it be this bug: http://lkml.org/lkml/2009/1/26/274
> It was fixed by following commit and should be in 2.6.29.
> 	commit 808ffa3d302257b9dc37b1412c1fcdf976fcddac
> 	Author: Eric Paris <eparis at redhat.com>
> 	Date:   Tue Jan 27 11:50:37 2009 +0000
> Sukadev

It's the stock Debian 2.6.30-1 kernel, so 2.6.30. The original oops
was on amd64, but I've also just had a very similar oops on the 686
(32-bit) build of the same kernel.
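
Since the fix mentioned above is said to be in 2.6.29, a 2.6.30 kernel
should in principle already carry it. A minimal sketch of that comparison
follows. kver_ge is a hypothetical helper name, not part of lxc or any
Debian tooling, and a version comparison only suggests that the commit is
present; it does not prove it.

```sh
#!/bin/sh
# kver_ge A B: succeed if dotted version A is >= dotted version B.
# Anything after the first "-" (e.g. the Debian "-1-686" suffix) is ignored.
# Hypothetical helper for illustration only.
kver_ge() {
    awk -v a="${1%%-*}" -v b="${2%%-*}" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0   # missing fields count as 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0                              # all fields equal
    }'
}
```

For example, kver_ge "$(uname -r)" 2.6.29 would succeed on the kernel
described here.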

The original oops (above) occurred when I did,

  lxc-start -n container &
  poweroff (in root system)

A segfault occurred each time after the "attempting to kill all
processes...." prompt, followed by the oops and then a "failed" from the
cleanup process. If the container is shut down before poweroff, there is
no oops.
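
The two steps above can be wrapped in a small script for repeated testing.
This is only a sketch: the function name, the DRY_RUN switch and the sleep
delay are assumptions added for illustration, and "container" is the same
placeholder name as above. It defaults to a dry run because the last step
really powers the host off.

```sh
#!/bin/sh
# Dry run by default: set DRY_RUN=0 only on a disposable test machine,
# because the final step really powers the host off.
DRY_RUN=${DRY_RUN:-1}

# repro_poweroff_oops NAME: start container NAME in the background, then
# power off the host while the container is still running.
repro_poweroff_oops() {
    name=${1:-container}   # placeholder container name
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: lxc-start -n $name &"
        echo "would run: poweroff"
        return 0
    fi
    lxc-start -n "$name" &
    sleep 10               # assumed delay so the container can finish booting
    poweroff
}
```

Sourcing the file and calling repro_poweroff_oops container prints the
commands; with DRY_RUN=0 it runs them.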

Everything below is on the stock Debian 2.6.30 686 kernel.

Another oops (same function, slightly different backtrace):

  1. ssh to root machine
  2. lxc-start -n container &
  3. lxc-console -n container -t 1 &
  4. close ssh on originating machine
  5. re-ssh into the machine

At this point I had a console into the container and one into the root
machine (where I ssh'd in), and the two were intermixed. Then the segfault
occurred with the following backtrace. Debugging info was not compiled into
this kernel, but the symbols are similar,

  IP: tty_open + 0x1a5

Again, when I re-ssh'd into the parent machine I had a prompt from within
the container and one from the parent machine, so some ttys got intermixed.

I've also managed to generate a hard lockup of the kernel from within
the container, but that seems to be related to networking. The process
for that was,

  1. bridge + veth interface
  2. start container and bring networking up from within container
  3. ssh into container
  4. ifdown in container
  5. at this point the ssh session remained in spite of container
     networking being down!!
  6. ifup container (new IP)
  7. hard lockup occurred with the backtrace scrolled way off screen

- Adam

PS. Running lxc-checkconfig on the Debian kernel yields,

Found kernel config file /boot/config-2.6.30-1-686
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled
Multiple /dev/pts instances: enabled

--- Control groups ---
Cgroup: enabled
Cgroup namespace: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: disabled
Cgroup cpuset: enabled

--- Misc ---
Veth pair device: enabled
Macvlan: enabled

I can provide you with the complete kernel config if you want it.
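
To pick the disabled entries out of output like the above, a short filter
is enough. This is a sketch under the assumption that the output is plain
"Name: status" text as pasted here (no colour escape codes); list_disabled
is a hypothetical helper name, not an lxc command.

```sh
#!/bin/sh
# list_disabled: read lxc-checkconfig style output on stdin and print the
# name of every feature whose status is exactly "disabled".
# Assumes plain "Name: status" lines without terminal colour codes.
list_disabled() {
    awk -F': *' '$2 ~ /^disabled[ \t\r]*$/ { print $1 }'
}
```

Piping the listing above through list_disabled prints only
"Cgroup memory controller".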
