[Bugme-new] [Bug 11315] New: Memory errors and device failures in IPW2200

bugme-daemon at bugzilla.kernel.org bugme-daemon at bugzilla.kernel.org
Tue Aug 12 22:03:00 PDT 2008


http://bugzilla.kernel.org/show_bug.cgi?id=11315

           Summary: Memory errors and device failures in IPW2200
           Product: Drivers
           Version: 2.5
     KernelVersion: 2.6.25.15
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: network-wireless
        AssignedTo: drivers_network-wireless at kernel-bugs.osdl.org
        ReportedBy: aa at quick.cz


Latest working kernel version:

Unknown.

Earliest failing kernel version:

2.6.25.15

Distribution:

Archlinux

Hardware Environment:

Asus M2400N, Pentium M 1.6 GHz, 768 MB RAM, Intel IPW2915

Software Environment:

X.org, KDE... Nothing special.

Problem Description:

ipw2200/0: page allocation failure. order:3, mode:0x4020
Pid: 7450, comm: ipw2200/0 Not tainted 2.6.25.15-AP #1
 [<c015f289>] __alloc_pages+0x2b9/0x380
 [<c017b61c>] __slab_alloc+0x2fc/0x610
 [<c017b72a>] __slab_alloc+0x40a/0x610
 [<f038d8fb>] ipw_rx_queue_replenish+0x5b/0x100 [ipw2200]
 [<c017c708>] __kmalloc_track_caller+0xb8/0x100
 [<f038d8fb>] ipw_rx_queue_replenish+0x5b/0x100 [ipw2200]
 [<c0313e85>] __alloc_skb+0x55/0x120
 [<f038d8fb>] ipw_rx_queue_replenish+0x5b/0x100 [ipw2200]
 [<f038f4d0>] ipw_bg_rx_queue_replenish+0x0/0x40 [ipw2200]
 [<f038f4f6>] ipw_bg_rx_queue_replenish+0x26/0x40 [ipw2200]
 [<c012ebbc>] run_workqueue+0x6c/0x150
 [<c012f08f>] worker_thread+0x7f/0xe0
 [<c01324e0>] autoremove_wake_function+0x0/0x50
 [<c012f010>] worker_thread+0x0/0xe0
 [<c0132097>] kthread+0x37/0x70
 [<c0132060>] kthread+0x0/0x70
 [<c0104d13>] kernel_thread_helper+0x7/0x14
 =======================
Mem-info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:   3
Active:132552 inactive:16871 dirty:9 writeback:1 unstable:0
 free:1692 slab:7344 mapped:13890 pagetables:956 bounce:0
DMA free:2992kB min:72kB low:88kB high:108kB active:1988kB inactive:4176kB present:16256kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 737 737
Normal free:3776kB min:3436kB low:4292kB high:5152kB active:528220kB inactive:63308kB present:755144kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 36*4kB 12*8kB 12*16kB 10*32kB 13*64kB 11*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2992kB
Normal: 778*4kB 31*8kB 4*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3776kB
54636 total pagecache pages
Swap cache: add 171402, delete 147872, find 34791/42583
Free swap  = 1630068kB
Total swap = 2000084kB
Free swap:       1630068kB
194368 pages of RAM
0 pages of HIGHMEM
2728 reserved pages
111884 pages shared
23530 pages swap cached
9 pages dirty
1 pages writeback
13890 pages mapped
7344 pages slab
956 pages pagetables
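
For anyone reading the numbers above: "order:3" means a request for 2^3
contiguous pages, i.e. 32 kB assuming 4 kB pages (an assumption, though the
4kB entries in the buddy lists suggest it). The sketch below re-adds the
per-zone buddy lists and counts how many free blocks are large enough for such
a request. The helper names are made up for illustration, and the script does
not model the kernel's watermark checks, so it only shows how fragmented the
free memory is, not why the allocator refused.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, as on 32-bit x86


def parse_buddy_line(line):
    """Split one 'Zone: N*SkB ... = TkB' line into its parts."""
    zone, rest = line.split(":", 1)
    # Map block size in kB -> number of free blocks of that size.
    blocks = {int(kb): int(n) for n, kb in re.findall(r"(\d+)\*(\d+)kB", rest)}
    reported_total = int(re.search(r"=\s*(\d+)kB", rest).group(1))
    return zone.strip(), blocks, reported_total


def blocks_at_least(blocks, order):
    """Count free blocks big enough for an allocation of the given order."""
    need_kb = PAGE_KB << order
    return sum(n for kb, n in blocks.items() if kb >= need_kb)


# The two buddy-list lines from the report, with the mail line wrap undone.
dma_line = ("DMA: 36*4kB 12*8kB 12*16kB 10*32kB 13*64kB 11*128kB 0*256kB "
            "0*512kB 0*1024kB 0*2048kB 0*4096kB = 2992kB")
normal_line = ("Normal: 778*4kB 31*8kB 4*16kB 1*32kB 1*64kB 0*128kB 1*256kB "
               "0*512kB 0*1024kB 0*2048kB 0*4096kB = 3776kB")

results = {}
for line in (dma_line, normal_line):
    zone, blocks, reported = parse_buddy_line(line)
    computed = sum(kb * n for kb, n in blocks.items())
    results[zone] = (computed, reported, blocks_at_least(blocks, 3))
    print(zone, computed, reported, blocks_at_least(blocks, 3))
```

Both totals match the reported values. In the Normal zone only three free
blocks are 32 kB or larger, and the zone's free memory (3776kB) is barely above
its "min" mark (3436kB).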

Steps to reproduce:

Attach the card to an AP and run *huge* data transfers in *both* directions.
Hundreds of these long messages appear in dmesg. Some of them end like this:

ipw2200: Firmware error detected.  Restarting.
ipw2200: Unable to load firmware: -12
ipw2200: Unable to load firmware: -12
ipw2200: Failed to up device

A total network failure follows. Reloading the module with modprobe can fix
it. However, the firmware is a binary black box, AFAIK, so the memory
allocation failure is probably more informative than the firmware failure.

Very rarely, the error messages reported allocation failures in other
processes too, but I'd guess those make up less than 1% of the total. One of
those processes was hald-addon-input.

(BTW, I had a *similar* problem on a dual CPU server (IBM xSeries 330) with an
Atheros WiFi card, with lots of strange page failures in vital kernel
processes. (Dmesg messages suggested a reboot, which I did.) But the server
uses a tainted kernel (requiring ath_hal and ath_pci to serve as an AP), so I
didn't post the messages here.)


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

