[Bugme-new] [Bug 9180] New: Null pointer dereference in workqueue processing in kblockd process on switching I/O scheduler

bugme-daemon at bugzilla.kernel.org bugme-daemon at bugzilla.kernel.org
Thu Oct 18 03:58:20 PDT 2007


http://bugzilla.kernel.org/show_bug.cgi?id=9180

           Summary: Null pointer dereference in workqueue processing in
                    kblockd process on switching I/O scheduler
           Product: Process Management
           Version: 2.5
     KernelVersion: vanilla 2.6.23 (and likely 2.6.23.1)
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
        AssignedTo: process_other at kernel-bugs.osdl.org
        ReportedBy: gentuu at gmail.com


Most recent kernel where this bug did not occur: 2.6.22.x
Distribution: Gentoo
Hardware Environment: i386 and x86_64
Software Environment: GNU
Problem Description:

An oops occurs when switching the I/O scheduler while the system is under
heavy fork() and disk load.

Here is the relevant part of dmesg on x86_64:

[  620.257633] ------------[ cut here ]------------
[  620.257641] kernel BUG at kernel/workqueue.c:258!
[  620.257643] invalid opcode: 0000 [1] SMP
[  620.257645] CPU 0
[  620.257647] Modules linked in: tcp_westwood ipt_REJECT xt_state iptable_filter ipt_REDIRECT ipt_owner xt_tcpudp xt_multiport iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables sdhci mmc_core sr_mod cdrom
[  620.257662] Pid: 77, comm: kblockd/0 Not tainted 2.6.23 #19
[  620.257664] RIP: 0010:[<ffffffff8024a4f6>]  [<ffffffff8024a4f6>] run_workqueue+0x116/0x170
[  620.257672] RSP: 0018:ffff81003f4a3ea0  EFLAGS: 00010282
[  620.257673] RAX: ffff81002fe67880 RBX: 0000000000000000 RCX: ffff81002fe67880
[  620.257676] RDX: ffff81002fe67880 RSI: ffff81003f4a3ed0 RDI: ffff81002fe67878
[  620.257677] RBP: ffffffff8031bd10 R08: ffff81003f4a2000 R09: ffff81003e31cdc0
[  620.257679] R10: 0000000000000000 R11: 0000000000000000 R12: ffff81003f675b40
[  620.257681] R13: ffff81003f675b48 R14: 0000000000000000 R15: 0000000000000000
[  620.257684] FS:  0000000000000000(0000) GS:ffffffff806a4000(0000) knlGS:0000000000000000
[  620.257686] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  620.257688] CR2: 00002b68aabb41de CR3: 000000002eb0a000 CR4: 00000000000006e0
[  620.257690] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  620.257692] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  620.257694] Process kblockd/0 (pid: 77, threadinfo ffff81003f4a2000, task ffff81003f6b8000)
[  620.257696] Stack:  ffffffff8024b040 ffff81003f675b58 ffff81003f675b40 ffffffff8024b040
[  620.257700]  ffff81003f675b48 ffffffff8024b0e3 0000000000000000 ffff81003f6b8000
[  620.257703]  ffffffff8024e900 ffff81003f4a3ee8 ffff81003f4a3ee8 00000000fffffffc
[  620.257706] Call Trace:
[  620.257709]  [<ffffffff8024b040>] worker_thread+0x0/0x110
[  620.257712]  [<ffffffff8024b040>] worker_thread+0x0/0x110
[  620.257714]  [<ffffffff8024b0e3>] worker_thread+0xa3/0x110
[  620.257718]  [<ffffffff8024e900>] autoremove_wake_function+0x0/0x30
[  620.257721]  [<ffffffff8024b040>] worker_thread+0x0/0x110
[  620.257723]  [<ffffffff8024b040>] worker_thread+0x0/0x110
[  620.257726]  [<ffffffff8024e53b>] kthread+0x4b/0x80
[  620.257729]  [<ffffffff8020cab8>] child_rip+0xa/0x12
[  620.257732]  [<ffffffff8024e4f0>] kthread+0x0/0x80
[  620.257734]  [<ffffffff8020caae>] child_rip+0x0/0x12
[  620.257736]
[  620.257737]
[  620.257737] Code: 0f 0b eb fe 66 0f 1f 44 00 00 65 48 8b 34 25 00 00 00 00 8b
[  620.257746] RIP  [<ffffffff8024a4f6>] run_workqueue+0x116/0x170
[  620.257749]  RSP <ffff81003f4a3ea0>

The i386 oops looks the same (I can provide it later).

Please note that the problem always occurs in a "kblockd/X" process (where X
is the CPU number on SMP).
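
To compare the i386 and x86_64 traces, the key fields of an oops like the one
above can be pulled out of saved dmesg output with a small helper. This is
only a sketch: the function name oops_summary is made up for illustration, and
it assumes the log lines have the same layout as in the trace above.

```shell
#!/bin/sh
# oops_summary: hypothetical helper (not part of any kernel tooling).
# Reads dmesg text on stdin and prints the BUG location and the name of
# the faulting task, assuming the line layout shown in the oops above.
oops_summary() {
    sed -n \
        -e 's/.*kernel BUG at \([^!]*\)!.*/bug: \1/p' \
        -e 's/.*Pid: [0-9]*, comm: \([^ ]*\) .*/comm: \1/p'
}
```

Typical use would be "dmesg | oops_summary" right after the oops, or feeding
it a saved log file on stdin.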


Steps to reproduce:
You need 2.6.23.1 with the CFQ and deadline I/O schedulers loaded.

Running these three tasks simultaneously triggers the problem:

1. Something like this:
# while :; do
>        echo deadline > /sys/block/sda/queue/scheduler
>        echo cfq > /sys/block/sda/queue/scheduler
> done

2. A fork()/clone() torture test (any other way of generating the load would
probably work too; it should not matter).
I'm running a simple perl script:
------------CUT----------
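#!/usr/bin/perl

# Have the kernel reap exited children automatically (no zombies).
$SIG{CHLD} = 'IGNORE';
while (1) {
        $r = fork;
        # The child exits immediately; the parent keeps forking.
        exit if (defined($r) && $r == 0);
}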
------------/CUT----------
It is not a fork bomb: it forks from a single process, and the children exit
immediately.

3. Some heavy disk usage, like this:
# while :; do tar xjf ~/linux-2.6.23.1.tar.bz2; rm -rf linux-2.6.23.1; done


It can take some time to occur, about 10-30 minutes.

The system almost always hangs completely when this oops occurs.
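
The scheduler-switching task from step 1 can be wrapped so that it first checks
that both schedulers are actually available on the device. This is a sketch
under assumptions, not a tested reproducer: the helper names
(active_scheduler, has_scheduler, switch_loop) are made up for illustration,
and it relies on the queue/scheduler file listing the available schedulers
separated by spaces with the active one in square brackets.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical. The sysfs path
# layout is the one used in step 1 (/sys/block/sda/queue/scheduler).

# Print the active scheduler, given the contents of a queue/scheduler
# file on stdin. The active one is marked with square brackets.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Succeed if scheduler $1 is listed in the scheduler-file contents on stdin.
has_scheduler() {
    tr -d '[]' | tr ' ' '\n' | grep -qx "$1"
}

# Task 1 from the steps above: flip between deadline and cfq until
# interrupted. $1 is the block device name, e.g. sda. Needs root.
switch_loop() {
    f="/sys/block/$1/queue/scheduler"
    for s in deadline cfq; do
        has_scheduler "$s" < "$f" || {
            echo "$s is not available on $1" >&2
            return 1
        }
    done
    while :; do
        echo deadline > "$f"
        echo cfq > "$f"
    done
}
```

Nothing runs until switch_loop is called, for example "switch_loop sda" as
root, alongside the fork and disk load from steps 2 and 3.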

