Hang (due to HW?) in qi_submit_sync()

Roland Dreier roland at kernel.org
Tue Jan 6 00:57:20 UTC 2015


From: Roland Dreier <roland at purestorage.com>

Hi, we're running kernel 3.10.59 (a fairly recent long-term kernel) on a
2-socket Xeon E5 v3 (Haswell) system.  We're using vfio to access some
PCI devices from userspace, and occasionally, when we kill a process,
we see the system hang in qi_submit_sync().

Based on a very old patch from Intel <https://lkml.org/lkml/2009/5/20/341>,
we added a timeout check to the wait loop in the dmar driver:

int qi_submit_sync(struct qi_desc *desc, struct intel_iommu *iommu)
{

//...

	/*
	 * update the HW tail register indicating the presence of
	 * new descriptors.
	 */
	writel(qi->free_head << DMAR_IQ_SHIFT, iommu->reg + DMAR_IQT_REG);

	start_time = get_cycles();
	while (qi->desc_status[wait_index] != QI_DONE) {
		/*
		 * We will leave the interrupts disabled, to prevent interrupt
		 * context to queue another cmd while a cmd is already submitted
		 * and waiting for completion on this cpu. This is to avoid
		 * a deadlock where the interrupt context can wait indefinitely
		 * for free slots in the queue.
		 */
		rc = qi_check_fault(iommu, index);
		if (rc)
			break;

		raw_spin_unlock(&qi->q_lock);

// We added this -->
		if (get_cycles() - start_time > DMAR_OPERATION_TIMEOUT) {
			printk(KERN_EMERG "desc_status[%d] = %d.\n",
			       wait_index, qi->desc_status[wait_index]);
/* line 888: */		BUG();
		}
// <-- to here

		cpu_relax();
		raw_spin_lock(&qi->q_lock);
	}

and indeed, when the system hangs, we see output such as:

    desc_status[69] = 1.
    ------------[ cut here ]------------
    kernel BUG at drivers/iommu/dmar.c:888!
    CPU: 8 PID: 12211 Comm: foed Tainted: P           O 3.10.59+ #201412290537+4e4984e.platinum
    task: ffff88275ac643e0 ti: ffff8825d329a000 task.ti: ffff8825d329a000
    RIP: 0010:[<ffffffff81529737>]  [<ffffffff81529737>] qi_submit_sync+0x3f7/0x490
    RSP: 0018:ffff8825d329ba10  EFLAGS: 00010092
    RAX: 0000000000000014 RBX: 0000000000000044 RCX: ffff881fffb0ec00
    RDX: 0000000000000000 RSI: ffff881fffb0d048 RDI: 0000000000000046
    RBP: ffff8825d329ba78 R08: ffffffffffffffff R09: 000000000001a4a1
    R10: 0000000000000051 R11: 00000000000000e4 R12: 00007068faa64fc8
    R13: ffff881fff40c780 R14: 0000000000000114 R15: ffff883ffec01a00
    FS:  00007f3c86ffb700(0000) GS:ffff881fffb00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f996d3f1ba0 CR3: 00000026222f0000 CR4: 00000000001407e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Stack:
     ffff8825d329ba88 0000000000000450 0000000000000440 ffff881ff3215000
     00000044d329bb18 0000000000000086 0000000000000044 ffff882500000045
     ffff881ff12b1600 0000000000000000 0000000000000246 ffff881ff278e858
    Call Trace:
     [<ffffffff8152f6b5>] free_irte+0xc5/0x100
     [<ffffffff81530834>] free_remapped_irq+0x44/0x60
     [<ffffffff81027b23>] destroy_irq+0x33/0xd0
     [<ffffffff81027ede>] native_teardown_msi_irq+0xe/0x10
     [<ffffffff812a6a70>] default_teardown_msi_irqs+0x60/0x80
     [<ffffffff812a64d9>] free_msi_irqs+0x99/0x150
     [<ffffffff812a749d>] pci_disable_msix+0x3d/0x60
     [<ffffffffa0078748>] vfio_msi_disable+0xc8/0xe0 [vfio_pci]
     [<ffffffffa0078f86>] vfio_pci_set_msi_trigger+0x2a6/0x2d0 [vfio_pci]
     [<ffffffffa007941c>] vfio_pci_set_irqs_ioctl+0x8c/0xa0 [vfio_pci]
     [<ffffffffa00773b0>] vfio_pci_release+0x70/0x150 [vfio_pci]
     [<ffffffffa006dcbc>] vfio_device_fops_release+0x1c/0x40 [vfio]
     [<ffffffff8114d7db>] __fput+0xdb/0x220
     [<ffffffff8114d92e>] ____fput+0xe/0x10
     [<ffffffff810614ac>] task_work_run+0xbc/0xe0
     [<ffffffff81043d0e>] do_exit+0x3ce/0xe50
     [<ffffffff8104557f>] do_group_exit+0x3f/0xa0
     [<ffffffff81054769>] get_signal_to_deliver+0x1a9/0x5b0
     [<ffffffff810023f8>] do_signal+0x48/0x5e0
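
For reference, here is a minimal userspace sketch of how the printed
status value decodes. It assumes the status enum order from
include/linux/intel-iommu.h (QI_FREE, QI_IN_USE, QI_DONE, QI_ABORT) and
a queue length of 256, and that the wait descriptor is queued directly
after the request descriptor. The enum tag qi_status and the helper
names are made up for illustration and are not kernel symbols.

```c
#include <stdio.h>

/* Same order as the anonymous enum in include/linux/intel-iommu.h;
 * the tag name qi_status is hypothetical, used only in this sketch. */
enum qi_status { QI_FREE, QI_IN_USE, QI_DONE, QI_ABORT };

/* Assumed number of descriptors in the invalidation queue. */
#define QI_LENGTH 256

/* Map a desc_status[] value to its symbolic name. */
static const char *qi_status_name(int status)
{
	switch (status) {
	case QI_FREE:   return "QI_FREE";
	case QI_IN_USE: return "QI_IN_USE";
	case QI_DONE:   return "QI_DONE";
	case QI_ABORT:  return "QI_ABORT";
	default:        return "unknown";
	}
}

/* The wait descriptor sits in the slot after the request descriptor,
 * wrapping around at the end of the queue. */
static int qi_wait_index(int index)
{
	return (index + 1) % QI_LENGTH;
}

/* Print a status line in the same style as the printk above,
 * with the symbolic name appended. */
static void qi_print_status(int wait_index, int status)
{
	printf("desc_status[%d] = %d (%s)\n",
	       wait_index, status, qi_status_name(status));
}
```

Under those assumptions, "desc_status[69] = 1" means the wait
descriptor in slot 69 was still QI_IN_USE when the timeout fired, i.e.
the hardware never wrote the completion status for it.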

As far as I can understand the driver, this is a "shouldn't happen,
your hardware is broken" occurrence.  However, I haven't been able to
find any relevant-looking sightings for our CPU.

Does anyone from Intel (or elsewhere) have any suggestions on how to
chase this further?
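
One possible next step would be to dump more IOMMU state from the
timeout branch before BUG(). The following is only a sketch of the
decoding, written as plain userspace C so it can be checked in
isolation. It assumes the fault status bit positions IQE = bit 4,
ICE = bit 5, ITE = bit 6 and a descriptor shift of 4, as I read them
from intel-iommu.h and the VT-d specification. The struct qi_snapshot
and the function qi_decode() are hypothetical names, and the raw values
would have to come from reading DMAR_FSTS_REG, DMAR_IQH_REG and
DMAR_IQT_REG in the driver.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed bit positions in the fault status register. */
#define DMA_FSTS_IQE	(1u << 4)	/* invalidation queue error */
#define DMA_FSTS_ICE	(1u << 5)	/* invalidation completion error */
#define DMA_FSTS_ITE	(1u << 6)	/* invalidation time-out error */

/* Queue head/tail registers hold a byte offset; each descriptor
 * is 16 bytes, so shifting by 4 yields a descriptor index. */
#define DMAR_IQ_SHIFT	4

/* Hypothetical container for the decoded state. */
struct qi_snapshot {
	unsigned int head;	/* next descriptor the hardware will fetch */
	unsigned int tail;	/* next slot software will fill */
	int iqe, ice, ite;	/* decoded error flags, 0 or 1 */
};

/* Decode raw register values into indices and error flags. */
static struct qi_snapshot qi_decode(uint32_t fsts, uint64_t iqh, uint64_t iqt)
{
	struct qi_snapshot s;

	s.head = (unsigned int)(iqh >> DMAR_IQ_SHIFT);
	s.tail = (unsigned int)(iqt >> DMAR_IQ_SHIFT);
	s.iqe = (fsts & DMA_FSTS_IQE) != 0;
	s.ice = (fsts & DMA_FSTS_ICE) != 0;
	s.ite = (fsts & DMA_FSTS_ITE) != 0;
	return s;
}

/* Print the decoded state on one line. */
static void qi_dump(const struct qi_snapshot *s)
{
	printf("head=%u tail=%u IQE=%d ICE=%d ITE=%d\n",
	       s->head, s->tail, s->iqe, s->ice, s->ite);
}
```

If the head index stays behind the tail with no error flag set, that
would suggest the hardware stopped fetching descriptors; a set ITE flag
would instead point at a device that did not answer an invalidation in
time. Either reading is a guess until the registers are captured.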

Thanks!
  Roland

