[Bugme-new] [Bug 12578] New: DMAR errors and driver instability

bugme-daemon at bugzilla.kernel.org bugme-daemon at bugzilla.kernel.org
Thu Jan 29 15:13:08 PST 2009


http://bugzilla.kernel.org/show_bug.cgi?id=12578

           Summary: DMAR errors and driver instability
           Product: Platform Specific/Hardware
           Version: 2.5
     KernelVersion: 2.6.29-rc3
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: x86-64
        AssignedTo: platform_x86_64 at kernel-bugs.osdl.org
        ReportedBy: adi at vmware.com


Latest working kernel version: unknown
Earliest failing kernel version: 2.6.28, earlier?
Distribution: Ubuntu 8.10
Hardware Environment: Dell Latitude E4300
Software Environment:
Problem Description: CONFIG_DMAR results in many "[DMA Write] Request ...
fault" messages, and failures in iwlagn and e1000e

Steps to reproduce: build commit 18e352e (Linux 2.6.29-rc3) with CONFIG_DMAR=y,
boot, and observe:

[    0.000000] Linux version 2.6.29-rc3-dmar (adi@philipl-e4300) (gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu11) ) #3 SMP Thu Jan 29 14:48:40 PST 2009
[    0.000000] Command line: root=/dev/sda5
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000]   AMD AuthenticAMD
[    0.000000]   Centaur CentaurHauls
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009bc00 (usable)
[    0.000000]  BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 00000000dd04d400 (usable)
[    0.000000]  BIOS-e820: 00000000dd04d400 - 00000000dd04f400 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000dd04f400 - 00000000e0000000 (reserved)
[    0.000000]  BIOS-e820: 00000000f8000000 - 00000000fc000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
[    0.000000]  BIOS-e820: 00000000fed18000 - 00000000fed1c000 (reserved)
[    0.000000]  BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
[    0.000000]  BIOS-e820: 00000000feda0000 - 00000000feda6000 (reserved)
[    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
[    0.000000]  BIOS-e820: 00000000ffe60000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100000000 - 000000011c000000 (usable)
[    0.000000] DMI 2.4 present.
[    0.000000] last_pfn = 0x11c000 max_arch_pfn = 0x100000000
[    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
[    0.000000] last_pfn = 0xdd04d max_arch_pfn = 0x100000000
[    0.000000] init_memory_mapping: 0000000000000000-00000000dd04d000
[    0.000000] last_map_addr: dd04d000 end: dd04d000
[    0.000000] init_memory_mapping: 0000000100000000-000000011c000000
[    0.000000] last_map_addr: 11c000000 end: 11c000000
[    0.000000] RAMDISK: 37bed000 - 37fefadd
[    0.000000] ACPI: RSDP 000FB9F0, 0024 (r2 DELL  )
[    0.000000] ACPI: XSDT DD051E00, 0074 (r1 DELL    M09     27D80A0A ASL       61)
[    0.000000] ACPI: FACP DD051C9C, 00F4 (r4 DELL    M09     27D80A0A ASL       61)
[    0.000000] ACPI Warning (tbfadt-0568): 32/64X length mismatch in Gpe0Block: 128/64 [20081204]
[    0.000000] FADT: X_PM1a_EVT_BLK.bit_width (16) does not match PM1_EVT_LEN (4)
[    0.000000] ACPI: DSDT DD052400, 64B2 (r2 INT430 SYSFexxx     1001 INTL 20050624)
[    0.000000] ACPI: FACS DD060C00, 0040
[    0.000000] ACPI: HPET DD051F00, 0038 (r1 DELL    M09            1 ASL       61)
[    0.000000] ACPI: DMAR DD060400, 0120 (r1 DELL    M09     27D80A0A ASL       61)
[    0.000000] ACPI: APIC DD052000, 0068 (r1 DELL    M09     27D80A0A ASL       47)
[    0.000000] ACPI: ASF! DD051C00, 006A (r32 DELL    M09     27D80A0A ASL       61)
[    0.000000] ACPI: MCFG DD051FC0, 003E (r16 DELL    M09     27D80A0A ASL       61)
[    0.000000] ACPI: SLIC DD05209C, 0176 (r1 DELL    M09     27D80A0A ASL       61)
[    0.000000] ACPI: TCPA DD052300, 0032 (r1                        0 ASL        0)
[    0.000000] ACPI: BOOT DD051BC0, 0028 (r1 DELL    M09     27D80A0A ASL       61)
[    0.000000] ACPI: SSDT DD050331, 066C (r1  PmRef    CpuPm     3000 INTL 20050624)

...

[    0.408471] DMAR:Host address width 36  
[    0.408575] DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed10000
[    0.408688] DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed11000
[    0.408799] DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed12000
[    0.408909] DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed13000
[    0.409020] DMAR:RMRR base: 0x00000000dd7e7000 end: 0x00000000dd7fffff
[    0.409129] DMAR:RMRR base: 0x00000000ddc00000 end: 0x00000000dfffffff
[    0.409296] IOMMU 0xfed12000: using Register based invalidation
[    0.409403] IOMMU 0xfed11000: using Register based invalidation
[    0.409510] IOMMU 0xfed10000: using Register based invalidation
[    0.409617] IOMMU 0xfed13000: using Register based invalidation
[    0.409724] IOMMU: Setting identity map for device 0000:00:02.0 [0xddc00000 - 0xe0000000]
[    0.409754] IOMMU: Setting identity map for device 0000:00:02.1 [0xddc00000 - 0xe0000000]
[    0.412569] IOMMU: Setting identity map for device 0000:00:1d.0 [0xdd7e7000 - 0xdd800000]
[    0.412763] IOMMU: Setting identity map for device 0000:00:1d.1 [0xdd7e7000 - 0xdd800000]
[    0.412953] IOMMU: Setting identity map for device 0000:00:1d.2 [0xdd7e7000 - 0xdd800000]
[    0.413146] IOMMU: Setting identity map for device 0000:00:1d.7 [0xdd7e7000 - 0xdd800000]
[    0.413336] IOMMU: Setting identity map for device 0000:00:1a.0 [0xdd7e7000 - 0xdd800000]
[    0.413526] IOMMU: Setting identity map for device 0000:00:1a.1 [0xdd7e7000 - 0xdd800000]
[    0.413716] IOMMU: Setting identity map for device 0000:00:1a.2 [0xdd7e7000 - 0xdd800000]
[    0.413906] IOMMU: Setting identity map for device 0000:00:1a.7 [0xdd7e7000 - 0xdd800000]
[    0.414099] IOMMU: gfx device 0000:00:02.0 1-1 mapping
[    0.414205] IOMMU: Setting identity map for device 0000:00:02.0 [0x0 - 0x9b000]
[    0.414423] IOMMU: Setting identity map for device 0000:00:02.0 [0x100000 - 0xdd04d000]
[    0.674461] IOMMU: Setting identity map for device 0000:00:02.0 [0x100000000 - 0x11c000000]
[    0.707494] IOMMU: gfx device 0000:00:02.1 1-1 mapping
[    0.707603] IOMMU: Setting identity map for device 0000:00:02.1 [0x0 - 0x9b000]
[    0.707823] IOMMU: Setting identity map for device 0000:00:02.1 [0x100000 - 0xdd04d000]
[    0.967825] IOMMU: Setting identity map for device 0000:00:02.1 [0x100000000 - 0x11c000000]
[    1.000894] IOMMU: Prepare 0-16M unity mapping for LPC
[    1.001004] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0x1000000]
[    1.036022] PCI-DMA: Intel(R) Virtualization Technology for Directed I/O
[    1.036155] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0
[    1.036464] hpet0: 4 comparators, 64-bit 14.318180 MHz counter

...

[   19.386279] iwlagn 0000:0c:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[   19.386352] iwlagn 0000:0c:00.0: restoring config space at offset 0x1 (was 0x100102, writing 0x100106)
[   19.386440] iwlagn 0000:0c:00.0: irq 34 for MSI/MSI-X
[   19.386497] iwlagn 0000:0c:00.0: firmware: requesting iwlwifi-5000-1.ucode
[   19.440326] iwlagn loaded firmware version 5.4.1.16
[   19.457483] DMAR:[DMA Write] Request device [0c:00.0] fault addr ff9df000 
[   19.457484] DMAR:[fault reason 05] PTE Write access is not set
[   19.457583] DMAR:[DMA Write] Request device [0c:00.0] fault addr ff9dd000 
[   19.457584] DMAR:[fault reason 05] PTE Write access is not set
[   19.596507] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbe3000 
[   19.596509] DMAR:[fault reason 05] PTE Write access is not set
[   19.597114] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbe1000 
[   19.597115] DMAR:[fault reason 05] PTE Write access is not set
[   19.597191] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbe0000 
[   19.597192] DMAR:[fault reason 05] PTE Write access is not set
[   19.597269] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbdf000 
[   19.597270] DMAR:[fault reason 05] PTE Write access is not set
[   19.597341] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbde000 
[   19.597342] DMAR:[fault reason 05] PTE Write access is not set
[   19.597410] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbdc000 
[   19.597411] DMAR:[fault reason 05] PTE Write access is not set
[   19.599351] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbd9000 
[   19.599352] DMAR:[fault reason 05] PTE Write access is not set
[   19.599582] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbd8000 
[   19.599583] DMAR:[fault reason 05] PTE Write access is not set
[   19.599607] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbd7000 
[   19.599608] DMAR:[fault reason 05] PTE Write access is not set
[   19.599889] Registered led device: iwl-phy0:radio
[   19.599902] Registered led device: iwl-phy0:assoc
[   19.599915] Registered led device: iwl-phy0:RX
[   19.599928] Registered led device: iwl-phy0:TX
[   19.599940] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbd4000 
[   19.599941] DMAR:[fault reason 05] PTE Write access is not set
[   19.599945] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbd3000 
[   19.599946] DMAR:[fault reason 05] PTE Write access is not set
[   19.603489] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff8c000 
[   19.603490] DMAR:[fault reason 05] PTE Write access is not set
[   19.603532] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff8b000 
[   19.603533] DMAR:[fault reason 05] PTE Write access is not set
[   19.603549] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff88000 
[   19.603550] DMAR:[fault reason 05] PTE Write access is not set
[   19.603944] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff86000 
[   19.603945] DMAR:[fault reason 05] PTE Write access is not set
[   19.603978] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff85000 

(Note that after the above and some further DMAR messages, iwlagn still works
well enough to associate with an AP and ssh out!)
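
To summarize fault messages like the ones above per device, a small script can
tally them. This is only a sketch: the regular expression is written against
the message format as it appears in this log, and the function name
tally_faults and the sample text are illustrative, not part of any kernel tool.

```python
import re
from collections import Counter

# Matches the "Request device" half of a DMAR fault report, in the format
# printed in this log (assumption: other kernel versions may format it differently).
FAULT_RE = re.compile(
    r"DMAR:\[DMA (Read|Write)\] Request device \[([0-9a-f:.]+)\] "
    r"fault addr ([0-9a-f]+)"
)

def tally_faults(log_text):
    """Count DMAR faults per (device, access type) pair."""
    counts = Counter()
    for match in FAULT_RE.finditer(log_text):
        access, device, _addr = match.groups()
        counts[(device, access)] += 1
    return counts

# Four lines copied from the log in this report.
sample = """\
[  272.337496] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff1f000
[  272.337498] DMAR:[fault reason 05] PTE Write access is not set
[  273.259504] DMAR:[DMA Read] Request device [00:19.0] fault addr fff6c000
[  273.259505] DMAR:[fault reason 06] PTE Read access is not set
"""

for (device, access), count in sorted(tally_faults(sample).items()):
    print(device, access, count)
```

Feeding it the full dmesg output instead of the sample would show how the
faults split between 0c:00.0 (iwlagn) and 00:19.0 (e1000e).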

Then, later, attempt to rsync hundreds of MB over eth0 (e1000e), and the
interface falls over:

[  269.369656] DMAR:[fault reason 05] PTE Write access is not set
[  269.779281] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff8d000 
[  269.779283] DMAR:[fault reason 05] PTE Write access is not set
[  269.880362] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff8d000 
[  269.880364] DMAR:[fault reason 05] PTE Write access is not set
[  269.982169] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff8d000 
[  269.982171] DMAR:[fault reason 05] PTE Write access is not set
[  270.394169] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffd5b000 
[  270.394170] DMAR:[fault reason 05] PTE Write access is not set
[  271.927601] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffdbf000 
[  271.927603] DMAR:[fault reason 05] PTE Write access is not set
[  272.234770] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffdbf000 
[  272.234771] DMAR:[fault reason 05] PTE Write access is not set
[  272.337496] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff1f000 
[  272.337498] DMAR:[fault reason 05] PTE Write access is not set
[  273.259504] DMAR:[DMA Read] Request device [00:19.0] fault addr fff6c000 
[  273.259505] DMAR:[fault reason 06] PTE Read access is not set
[  274.816140] 0000:00:19.0: eth0: Detected Tx Unit Hang:
[  274.816141]   TDH                  <1a>
[  274.816148]   TDT                  <3>
[  274.816148]   next_to_use          <3>
[  274.816149]   next_to_clean        <17>
[  274.816150] buffer_info[next_to_clean]:
[  274.816150]   time_stamp           <ffffe5e2>
[  274.816151]   next_to_watch        <1a>
[  274.816152]   jiffies              <ffffe768>
[  274.816152]   next_to_watch.status <0>
[  275.307808] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffd27000 
[  275.307810] DMAR:[fault reason 05] PTE Write access is not set
[  276.126152] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff0f000 
[  276.126154] DMAR:[fault reason 05] PTE Write access is not set
[  276.816323] 0000:00:19.0: eth0: Detected Tx Unit Hang:
[  276.816325]   TDH                  <1a>
[  276.816325]   TDT                  <3>
[  276.816326]   next_to_use          <3>
[  276.816326]   next_to_clean        <17>
[  276.816327] buffer_info[next_to_clean]:
[  276.816328]   time_stamp           <ffffe5e2>
[  276.816328]   next_to_watch        <1a>
[  276.816329]   jiffies              <ffffe95c>
[  276.816330]   next_to_watch.status <0>
[  278.788571] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffd5f000 
[  278.788573] DMAR:[fault reason 05] PTE Write access is not set
[  278.816330] 0000:00:19.0: eth0: Detected Tx Unit Hang:
[  278.816331]   TDH                  <1a>
[  278.816331]   TDT                  <3>
[  278.816332]   next_to_use          <3>
[  278.816333]   next_to_clean        <17>
[  278.816333] buffer_info[next_to_clean]:
[  278.816334]   time_stamp           <ffffe5e2>
[  278.816335]   next_to_watch        <1a>
[  278.816335]   jiffies              <ffffeb50>
[  278.816336]   next_to_watch.status <0>
[  279.300570] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff17000 
[  279.300572] DMAR:[fault reason 05] PTE Write access is not set
[  280.324596] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffbef000 
[  280.324598] DMAR:[fault reason 05] PTE Write access is not set
[  280.631797] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff1f000 
[  280.631799] DMAR:[fault reason 05] PTE Write access is not set
[  280.816304] 0000:00:19.0: eth0: Detected Tx Unit Hang:
[  280.816305]   TDH                  <1a>
[  280.816306]   TDT                  <3>
[  280.816307]   next_to_use          <3>
[  280.816308]   next_to_clean        <17>
[  280.816309] buffer_info[next_to_clean]:
[  280.816310]   time_stamp           <ffffe5e2>
[  280.816316]   next_to_watch        <1a>
[  280.816317]   jiffies              <ffffed44>
[  280.816318]   next_to_watch.status <0>
[  280.939243] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff1f000 
[  280.939245] DMAR:[fault reason 05] PTE Write access is not set
[  281.860614] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffd4b000 
[  281.860616] DMAR:[fault reason 05] PTE Write access is not set
[  282.372661] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff23000 
[  282.372662] DMAR:[fault reason 05] PTE Write access is not set
[  282.816260] ------------[ cut here ]------------
[  282.816263] WARNING: at net/sched/sch_generic.c:226 dev_watchdog+0xcd/0x184()
[  282.816266] Hardware name: Latitude E4300                  
[  282.816268] NETDEV WATCHDOG: eth0 (e1000e): transmit timed out
[  282.816271] Modules linked in: pci_slot sbp2 btusb iwlagn iwlcore rfkill lib80211 button battery ac ohci1394 ieee1394 e1000e thermal fan
[  282.816288] Pid: 0, comm: swapper Not tainted 2.6.29-rc3-dmar #3
[  282.816291] Call Trace:
[  282.816293]  <IRQ>  [<ffffffff81044947>] warn_slowpath+0xd3/0x10f
[  282.816305]  [<ffffffff81486470>] ? _spin_lock_irqsave+0x36/0x3f
[  282.816316]  [<ffffffff8105f616>] ? getnstimeofday+0x58/0xb4
[  282.816318]  [<ffffffff81486550>] ? _spin_lock+0x17/0x1a
[  282.816320]  [<ffffffff81392227>] ? netif_tx_lock+0x72/0x8c
[  282.816323]  [<ffffffff81392241>] ? dev_watchdog+0x0/0x184
[  282.816325]  [<ffffffff8139230e>] dev_watchdog+0xcd/0x184
[  282.816328]  [<ffffffff81392241>] ? dev_watchdog+0x0/0x184
[  282.816331]  [<ffffffff8104dc85>] run_timer_softirq+0x1a3/0x232
[  282.816334]  [<ffffffff81049e31>] __do_softirq+0x8a/0x151
[  282.816338]  [<ffffffff810127dc>] call_softirq+0x1c/0x30
[  282.816340]  [<ffffffff81013860>] do_softirq+0x44/0x8f
[  282.816342]  [<ffffffff81049bb3>] irq_exit+0x3f/0x79
[  282.816345]  [<ffffffff81022e5b>] smp_apic_timer_interrupt+0x93/0xac
[  282.816348]  [<ffffffff810121b3>] apic_timer_interrupt+0x13/0x20
[  282.816350]  <EOI> <4>---[ end trace 6168acfc226aca2d ]---
[  282.912504] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff23000 
[  282.912505] DMAR:[fault reason 05] PTE Write access is not set
[  283.601446] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffb2b000 
[  283.601448] DMAR:[fault reason 05] PTE Write access is not set
[  283.703850] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffb2b000 
[  283.703852] DMAR:[fault reason 05] PTE Write access is not set
[  284.727905] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff0f000 
[  284.727907] DMAR:[fault reason 05] PTE Write access is not set
[  285.651214] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffd1f000 
[  285.651216] DMAR:[fault reason 05] PTE Write access is not set
[  285.992987] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[  288.004772] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff17000 
[  288.004773] DMAR:[fault reason 05] PTE Write access is not set
[  288.108245] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff17000 
[  288.108246] DMAR:[fault reason 05] PTE Write access is not set
[  289.438590] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffd0b000 
[  289.438591] DMAR:[fault reason 05] PTE Write access is not set
[  289.643496] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffd0b000 
[  289.643498] DMAR:[fault reason 05] PTE Write access is not set
[  290.871922] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffdbf000 
[  290.871924] DMAR:[fault reason 05] PTE Write access is not set
[  291.804344] 0000:00:19.0: eth0: Detected Tx Unit Hang:
[  291.804345]   TDH                  <0>
[  291.804346]   TDT                  <1>
[  291.804346]   next_to_use          <1>
[  291.804347]   next_to_clean        <0>
[  291.804347] buffer_info[next_to_clean]:
[  291.804348]   time_stamp           <fffff59a>
[  291.804349]   next_to_watch        <0>
[  291.804349]   jiffies              <fffff7ff>
[  291.804350]   next_to_watch.status <0>
[  292.817665] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffd43000 
[  292.817667] DMAR:[fault reason 05] PTE Write access is not set
[  293.124854] DMAR:[DMA Write] Request device [0c:00.0] fault addr ffd43000 
[  293.124856] DMAR:[fault reason 05] PTE Write access is not set
[  293.227634] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff0f000 
[  293.227636] DMAR:[fault reason 05] PTE Write access is not set
[  294.148916] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff0f000 
[  294.148918] DMAR:[fault reason 05] PTE Write access is not set
[  294.251238] DMAR:[DMA Write] Request device [0c:00.0] fault addr fff0f000 
[  294.251240] DMAR:[fault reason 05] PTE Write access is not set

Similar failures occur on iwlagn, but they are somewhat harder to reproduce
since I have limited wifi bandwidth available.
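
As a quick sanity check on the fault addresses, the sketch below tests a few
of them against the usable ranges from the BIOS-e820 map at the top of the
log. The helper name in_usable_ram is hypothetical, and the ranges are treated
as half-open intervals, which is an assumption about how to read the map.

```python
# Usable ranges copied from the BIOS-e820 lines in this log,
# treated as half-open [start, end) intervals (assumption).
USABLE_RAM = [
    (0x0000000000000000, 0x000000000009BC00),
    (0x0000000000100000, 0x00000000DD04D400),
    (0x0000000100000000, 0x000000011C000000),
]

def in_usable_ram(addr):
    """Return True if addr lies inside any usable e820 range."""
    return any(start <= addr < end for start, end in USABLE_RAM)

# A few fault addresses taken from the DMAR messages above.
fault_addrs = [0xFF9DF000, 0xFFBE3000, 0xFFF8D000, 0xFFF6C000]

for addr in fault_addrs:
    print(hex(addr), in_usable_ram(addr))
```

None of the sampled addresses lies in usable RAM. That would be consistent
with them being I/O virtual addresses allocated by the IOMMU code rather than
physical addresses, though that interpretation is an assumption and not
something the log itself states.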


-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

