[Bugme-new] [Bug 8962] New: sky2: network intermittently unavailable after ifdown/ifup under load

bugme-daemon at bugzilla.kernel.org
Thu Aug 30 11:22:40 PDT 2007


http://bugzilla.kernel.org/show_bug.cgi?id=8962

           Summary: sky2: network intermittently unavailable after
                    ifdown/ifup under load
           Product: Networking
           Version: 2.5
     KernelVersion: 2.6.23-rc4
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: IPV4
        AssignedTo: shemminger at osdl.org
        ReportedBy: gbailey at lxpro.com


Most recent kernel where this bug did not occur:  Unknown

Distribution:  CentOS 4.5

Hardware Environment:  Intel server board SE7320VP21

02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8050 PCI-E ASF Gigabit Ethernet Controller (rev 18)
        Subsystem: Intel Corporation: Unknown device 3466
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at deefc000 (64-bit, non-prefetchable) [size=16K]
        I/O ports at b800 [size=256]
        Expansion ROM at deec0000 [disabled] [size=128K]
        Capabilities: [48] Power Management version 2
        Capabilities: [50] Vital Product Data
        Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/1 Enable-
        Capabilities: [e0] Express Legacy Endpoint IRQ 0

Software Environment:  CentOS 4.5 install + "vanilla" kernel 2.6.23-rc4

Problem Description:

Discovered while attempting to troubleshoot:
https://bugzilla.redhat.com/show_bug.cgi?id=228733

I'm trying to understand the "tx timeout" messages and how to reproduce them.
In my test environment, I have 2 servers, each of which has a sky2 Marvell NIC
connected to a switch as "eth0".

On server "B", I type "nc -l -p 3409 > /dev/null"

On server "A", I type "nc server-B 3409 < /dev/zero"

I see lots of traffic from A->B, as would be expected.  If I shut down eth0 on
server "B" using "ifdown eth0", wait a few seconds, and then re-enable eth0 on
server "B" using "ifup eth0", I see the following in "dmesg" output on server
B:

sky2 eth0: disabling interface
sky2 eth0: enabling interface
sky2 eth0: ram buffer 48K
sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control rx
ip_tables: (C) 2000-2006 Netfilter Core Team

This is as expected.  The problem is that server B occasionally ends up in a
state where it can no longer ping or otherwise reach the local subnet.  Both
"mii-tool" and "ethtool eth0" show that a link is present.

If I perform "ifdown eth0; ifup eth0" on server B, it doesn't help.
If I unload the sky2 module, then things clear up and I'm back on the network
again.
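
Because the failure is intermittent, it may help to automate the
ifdown/ifup cycle and stop as soon as connectivity is lost.  The following
is only a rough sketch, not something that was run for this report: the
function names, the cycle count, the delays, and the use of "ping" against
a peer on the local subnet are all assumptions made for illustration.  It
would need to run as root on server B while the nc transfer is active.

```python
import subprocess
import time


def run_ok(cmd):
    """Run a command and report whether it exited with status 0."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def cycle_until_stuck(iface, peer, max_cycles=50, down_secs=3.0,
                      settle_secs=5.0, run=run_ok, sleep=time.sleep):
    """Bounce `iface` until `peer` stops answering ping.

    Returns the 1-based cycle number on which connectivity was lost,
    or None if every cycle recovered.  All defaults are illustrative.
    """
    for cycle in range(1, max_cycles + 1):
        run(["ifdown", iface])
        sleep(down_secs)      # "wait a few seconds", as in the report
        run(["ifup", iface])
        sleep(settle_secs)    # assumed time for the link to come back up
        # Three echo requests, waiting up to 2 s for a reply; ping exits
        # non-zero when no reply at all is received.
        if not run(["ping", "-c", "3", "-W", "2", peer]):
            return cycle
    return None
```

For example, cycle_until_stuck("eth0", "<address of server A>") would
return the number of the cycle after which ping stopped working, leaving
the interface in the broken state for inspection.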

I'm curious about this testcase because the symptom seems to match the earlier
"tx timeout" messages; the driver tried to re-enable itself after a timeout,
but it's still not able to see any traffic.

Steps to reproduce:

See "Problem Description" above.  While traffic is continuously being
transmitted from server "A" to server "B", shut down the network interface on
server "B", and then start the interface on server "B".  Monitoring RX traffic
on server "B" will indicate when it is no longer receiving the bytes sent from
server "A".
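
One possible way to monitor RX traffic on server "B" is to sample the
received-byte counter in /proc/net/dev at a fixed interval.  This is a
minimal sketch under that assumption; the helper names and the one-second
interval are hypothetical and were not part of the original test setup.

```python
import time


def rx_bytes(iface, proc_text):
    """Return the received-byte counter for `iface` from /proc/net/dev text."""
    for line in proc_text.splitlines():
        # Interface lines look like "  eth0: <rx bytes> <rx packets> ...";
        # the two header lines contain no colon and are skipped.
        name, sep, counters = line.partition(":")
        if sep and name.strip() == iface:
            return int(counters.split()[0])   # first field is RX bytes
    raise ValueError(f"interface {iface!r} not found")


def watch_rx(iface, interval=1.0, path="/proc/net/dev"):
    """Print the RX rate once per interval and flag samples with no growth."""
    with open(path) as f:
        last = rx_bytes(iface, f.read())
    while True:
        time.sleep(interval)
        with open(path) as f:
            now = rx_bytes(iface, f.read())
        delta = now - last
        note = "  <-- no RX bytes" if delta == 0 else ""
        print(f"{iface}: {delta / interval:.0f} B/s{note}")
        last = now
```

Calling watch_rx("eth0") on server "B" during the transfer should show a
steady rate while traffic arrives; a run of flagged samples after "ifup"
would correspond to the stuck state described above.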



