[lxc-devel] segfault on shutdown if containers running
adamm at zombino.com
Wed Jul 1 21:29:45 PDT 2009
Sukadev Bhattiprolu wrote:
> Daniel Lezcano [dlezcano at fr.ibm.com] wrote:
>> Adam Majer wrote:
>>>>>>> chrdev_open + 0x148/0x167
>>>>>>> chrdev_open + 0x0/0x167
>>>>>>> __dentry_open + 0x148/0x260
>>>>>>> do_filp_open + 0x468/0x85a
>>>>>>> alloc_fd + ....
>>>>>>> do_sys_gen + ...
>>>>>>> system_call_fastpath + ....
>>>>>>> RIP tty_open
>>>>>> which kernel are you using ? could you run :
>>>>>> $ addr2line -e <vmlinux> ffffffff803b26f1
>>>>> Would setting this to yes produce a backtrace with line numbers?
>>>> Yes and maybe CONFIG_FRAME_POINTER too.
>>> Well, adding symbols didn't add line numbers to the backtrace. But now I
>>> can use addr2line with the vmlinux (all 78mb of it) to get you guys a
>>> backtrace with line numbers :)
>>> OOPS (NULL pointer dereference) at
>>> drivers/char/tty_io.c:1321 (tty_open)
>>> fs/char_dev.c:397 (chrdev_open)
>>> fs/char_dev.c:357 (chrdev_open)
>>> fs/open.c:841 (__dentry_open)
>>> arch/x86/include/asm/atomic_64.h:117 (do_filp_open)
>>> fs/file.c:459 (alloc_fd)
>> Excellent !
>> Suka ? Isn't this oops related to the newpts instance ?
> What version of the kernel are you running ?
> Could it be this bug: http://lkml.org/lkml/2009/1/26/274
> It was fixed by following commit and should be in 2.6.29.
> commit 808ffa3d302257b9dc37b1412c1fcdf976fcddac
> Author: Eric Paris <eparis at redhat.com>
> Date: Tue Jan 27 11:50:37 2009 +0000
It's a stock Debian 2.6.30-1 kernel, so it is 2.6.30. The original OOPS
was on AMD64, but I've also just had a very similar OOPS on the 686
(32-bit) version of the same kernel.
The original oops (above) occurred when I ran
lxc-start -n container &
poweroff (in root system)
The segfault occurred each time after the message "attempting to kill all
processes....", followed by the oops and then a "failed" from the cleanup
process. If the container is shut down prior to poweroff, there is no oops.
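Until the oops itself is fixed, the observation above suggests stopping
the containers before powering off. A minimal sketch of that workaround,
assuming lxc-stop -n NAME stops a running container (lxc-stop and the
function name stop_then_poweroff are not from the steps above). By
default it only prints the commands it would run:

```shell
# Hypothetical helper: stop the named containers, then power off.
# RUN defaults to "echo", so nothing is executed unless RUN is set
# to an empty string (RUN= stop_then_poweroff container).
stop_then_poweroff() {
    run=${RUN-echo}
    for name in "$@"; do
        # Assumption: lxc-stop -n NAME stops the container.
        $run lxc-stop -n "$name"
    done
    $run poweroff
}
```

Calling stop_then_poweroff container with RUN unset prints the lxc-stop
and poweroff commands in the order they would be executed.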
Everything that follows is on the 686 build of the stock Debian 2.6.30 kernel.
Another OOPS (same function, slightly different backtrace):
1. ssh to root machine
2. lxc-start -n container &
3. lxc-console -n container -t 1 &
4. close ssh on originating machine
5. re-ssh into the machine
At this point the console into the container and the console of the root
machine (where I ssh'd in) were intermixed. Then the segfault occurred.
Debugging info is not compiled into this kernel, but the instruction pointer is
IP: tty_open + 0x1a5
Again, after I re-ssh'd into the parent machine I had a prompt from within
the container and one from the parent machine, so some ttys got intermixed.
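Since that last report is only symbol + offset, it has to be turned into
an absolute address before addr2line can be used on it. A rough sketch,
assuming a System.map-style file ("address type name" per line) that
matches the running kernel; the helper name sym_offset_to_addr and the
System.map path in the comment are my assumptions, and the result is
only useful with a vmlinux built with debug info:

```shell
# Hypothetical helper: print the absolute address (hex, no 0x prefix)
# for "symbol + offset", looking the symbol up in a System.map-style file.
sym_offset_to_addr() {
    map_file=$1
    symbol=$2
    offset=$3
    # The first matching line gives the symbol's base address (column 1).
    base=$(awk -v s="$symbol" '$3 == s { print $1; exit }' "$map_file")
    if [ -z "$base" ]; then
        echo "symbol not found: $symbol" >&2
        return 1
    fi
    printf '%x\n' $(( 0x$base + $offset ))
}

# Possible use (paths are assumptions, not taken from this report):
#   addr=$(sym_offset_to_addr /boot/System.map-2.6.30-1-686 tty_open 0x1a5)
#   addr2line -e vmlinux "$addr"
```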
I've also managed to generate a hard lockup of the kernel from within
the container, but that seems to be related to networking. The process
for that was,
1. bridge + veth interface
2. start container and networking up from within container
3. ssh into container
4. ifdown in container
5. at this point the ssh session remained in spite of the container's interface being down
6. ifup in container (new IP)
7. hard lockup occurred with the backtrace scrolled way off screen
PS. Running lxc-checkconfig on the Debian kernel yields,
Found kernel config file /boot/config-2.6.30-1-686
--- Namespaces ---
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled
Multiple /dev/pts instances: enabled
--- Control groups ---
Cgroup namespace: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: disabled
Cgroup cpuset: enabled
--- Misc ---
Veth pair device: enabled
I can provide you with the complete kernel config if you want it.
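In the meantime, individual options can be checked directly against the
config file that lxc-checkconfig found. A minimal sketch; the helper name
config_enabled is hypothetical, and the option names in the comment
(CONFIG_NET_NS, CONFIG_DEVPTS_MULTIPLE_INSTANCES, CONFIG_VETH) are my
reading of which kernel options sit behind the lines above, not something
taken from the lxc-checkconfig output itself:

```shell
# Hypothetical helper: succeed if OPTION is built in (=y) or modular (=m)
# in a kernel config file, fail otherwise.
config_enabled() {
    config_file=$1
    option=$2
    grep -Eq "^${option}=(y|m)\$" "$config_file"
}

# Possible use against the config file named above:
#   for opt in CONFIG_NET_NS CONFIG_DEVPTS_MULTIPLE_INSTANCES CONFIG_VETH; do
#       if config_enabled /boot/config-2.6.30-1-686 "$opt"; then
#           echo "$opt: enabled"
#       else
#           echo "$opt: disabled"
#       fi
#   done
```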