[Bugme-new] [Bug 40552] New: DRBD sync triggers lock inversion complaint

bugzilla-daemon at bugzilla.kernel.org bugzilla-daemon at bugzilla.kernel.org
Thu Aug 4 11:56:50 PDT 2011


https://bugzilla.kernel.org/show_bug.cgi?id=40552

           Summary: DRBD sync triggers lock inversion complaint
           Product: IO/Storage
           Version: 2.5
    Kernel Version: 3.1-rc0
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
        AssignedTo: io_other at kernel-bugs.osdl.org
        ReportedBy: bvanassche at acm.org
        Regression: No


Kernel version: v3.0-7347-g288d5ab (most recent commit was
288d5abec8314ae50fe6692f324b0444acae8486 with date August 3, 2011).

While syncing a DRBD volume (on top of an LVM logical volume) for the first
time the following complaint appeared in the kernel log:

block drbd0: Implicitly set pdsk from Inconsistent to UpToDate
block drbd0: conn( Connected -> VerifyS ) disk( Inconsistent -> UpToDate )
pdsk( Inconsistent -> UpToDate ) 
block drbd0: Starting Online Verify from sector 0

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.0.0-scst-debug+ #6
---------------------------------------------------------
drbd0_receiver/5871 just changed the state of lock:
(&(&mdev->req_lock)->rlock){..-...}, at: [<ffffffffa023e072>]
drbd_endio_read_sec_final+0x42/0x330 [drbd]
but this lock took another, SOFTIRQ-unsafe lock in the past:
(&(&mdev->peer_seq_lock)->rlock){+.+...}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
Possible interrupt unsafe locking scenario:

      CPU0                    CPU1
      ----                    ----
 lock(&(&mdev->peer_seq_lock)->rlock);
                              local_irq_disable();
                              lock(&(&mdev->req_lock)->rlock);
                              lock(&(&mdev->peer_seq_lock)->rlock);
 <Interrupt>
   lock(&(&mdev->req_lock)->rlock);

*** DEADLOCK ***

no locks held by drbd0_receiver/5871.

the shortest dependencies between 2nd lock and 1st lock:
-> (&(&mdev->peer_seq_lock)->rlock){+.+...} ops: 3 {
   HARDIRQ-ON-W at:
                         [<ffffffff81088b14>] __lock_acquire+0x754/0x1d70
                         [<ffffffff8108a725>] lock_acquire+0x85/0x130
                         [<ffffffff8140438b>] _raw_spin_lock+0x3b/0x50
                         [<ffffffffa0263604>] drbd_nl_syncer_conf+0x524/0xbe0
[drbd]
                         [<ffffffffa02689e9>]
drbd_connector_callback+0x139/0x750 [drbd]
                         [<ffffffff81298a7b>] cn_rx_skb+0xdb/0x100
                         [<ffffffff81357603>] netlink_unicast+0x2a3/0x2f0
                         [<ffffffff8135787b>] netlink_sendmsg+0x22b/0x360
                         [<ffffffff8131f971>] sock_sendmsg+0xe1/0x120
                         [<ffffffff813227bd>] sys_sendto+0x10d/0x140
                         [<ffffffff814059ab>] system_call_fastpath+0x16/0x1b
   SOFTIRQ-ON-W at:
                         [<ffffffff81088b4d>] __lock_acquire+0x78d/0x1d70
                         [<ffffffff8108a725>] lock_acquire+0x85/0x130
                         [<ffffffff8140438b>] _raw_spin_lock+0x3b/0x50
                         [<ffffffffa0263604>] drbd_nl_syncer_conf+0x524/0xbe0
[drbd]
                         [<ffffffffa02689e9>]
drbd_connector_callback+0x139/0x750 [drbd]
                         [<ffffffff81298a7b>] cn_rx_skb+0xdb/0x100
                         [<ffffffff81357603>] netlink_unicast+0x2a3/0x2f0
                         [<ffffffff8135787b>] netlink_sendmsg+0x22b/0x360
                         [<ffffffff8131f971>] sock_sendmsg+0xe1/0x120
                         [<ffffffff813227bd>] sys_sendto+0x10d/0x140
                         [<ffffffff814059ab>] system_call_fastpath+0x16/0x1b
   INITIAL USE at:
                        [<ffffffff810887a4>] __lock_acquire+0x3e4/0x1d70
                        [<ffffffff8108a725>] lock_acquire+0x85/0x130
                        [<ffffffff8140438b>] _raw_spin_lock+0x3b/0x50
                        [<ffffffffa0263604>] drbd_nl_syncer_conf+0x524/0xbe0
[drbd]
                        [<ffffffffa02689e9>]
drbd_connector_callback+0x139/0x750 [drbd]
                        [<ffffffff81298a7b>] cn_rx_skb+0xdb/0x100
                        [<ffffffff81357603>] netlink_unicast+0x2a3/0x2f0
                        [<ffffffff8135787b>] netlink_sendmsg+0x22b/0x360
                        [<ffffffff8131f971>] sock_sendmsg+0xe1/0x120
                        [<ffffffff813227bd>] sys_sendto+0x10d/0x140
                        [<ffffffff814059ab>] system_call_fastpath+0x16/0x1b
 }
 ... key      at: [<ffffffffa0275a5c>] __key.46241+0x0/0xffffffffffff35a4
[drbd]
 ... acquired at:
  [<ffffffff8108a725>] lock_acquire+0x85/0x130
  [<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
  [<ffffffff81048429>] try_to_wake_up+0x39/0x300
  [<ffffffff81048702>] default_wake_function+0x12/0x20
  [<ffffffff810721cd>] autoremove_wake_function+0x1d/0x50
  [<ffffffff81034b49>] __wake_up_common+0x59/0x90
  [<ffffffff81037d98>] __wake_up+0x48/0x70
  [<ffffffffa02421b2>] got_RqSReply+0x42/0x90 [drbd]
  [<ffffffffa024d957>] drbd_asender+0x3b7/0x5f0 [drbd]
  [<ffffffffa0255262>] drbd_thread_setup+0x72/0x100 [drbd]
  [<ffffffff81071c06>] kthread+0x96/0xa0
  [<ffffffff81407ad4>] kernel_thread_helper+0x4/0x10

-> (&(&mdev->req_lock)->rlock){..-...} ops: 145 {
  IN-SOFTIRQ-W at:
                       [<ffffffff81088ad5>] __lock_acquire+0x715/0x1d70
                       [<ffffffff8108a725>] lock_acquire+0x85/0x130
                       [<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
                       [<ffffffffa023e072>]
drbd_endio_read_sec_final+0x42/0x330 [drbd]
                       [<ffffffffa023e3d9>] drbd_endio_sec+0x79/0x4d0 [drbd]
                       [<ffffffff811611bd>] bio_endio+0x1d/0x40
                       [<ffffffffa02232c4>] dec_pending+0x1f4/0x3c0 [dm_mod]
                       [<ffffffffa0223533>] clone_endio+0xa3/0xe0 [dm_mod]
                       [<ffffffff811611bd>] bio_endio+0x1d/0x40
                       [<ffffffff811c147b>] req_bio_endio.clone.41+0xab/0xf0
                       [<ffffffff811c3640>] blk_update_request+0x100/0x570
                       [<ffffffff811c3ae4>] blk_update_bidi_request+0x34/0xa0
                       [<ffffffff811c3ccf>] blk_end_bidi_request+0x2f/0x80
                       [<ffffffff811c3d60>] blk_end_request+0x10/0x20
                       [<ffffffff812b3f7f>] scsi_io_completion+0xaf/0x630
                       [<ffffffff812aace8>] scsi_finish_command+0xc8/0x130
                       [<ffffffff812b3dcf>] scsi_softirq_done+0x13f/0x160
                       [<ffffffff811ca4a0>] blk_done_softirq+0x80/0xa0
                       [<ffffffff81055839>] __do_softirq+0xc9/0x230
                       [<ffffffff81407bcc>] call_softirq+0x1c/0x30
                       [<ffffffff81004c0d>] do_softirq+0x8d/0xc0
                       [<ffffffff81055d0e>] irq_exit+0x9e/0xc0
                       [<ffffffff81408176>] do_IRQ+0x66/0xe0
                       [<ffffffff81405070>] ret_from_intr+0x0/0x13
                       [<ffffffff812d945c>] mptspi_qcmd+0x8c/0xd0
                       [<ffffffff812ac05e>] scsi_dispatch_cmd+0x13e/0x340
                       [<ffffffff812b30e7>] scsi_request_fn+0x397/0x490
                       [<ffffffff811c545f>] __make_request+0x32f/0x350
                       [<ffffffff811c2ea9>] generic_make_request+0x1f9/0x5c0
                       [<ffffffffa024943a>] drbd_submit_ee+0x13a/0x230 [drbd]
                       [<ffffffffa024ab96>] receive_DataRequest+0x4b6/0x850
[drbd]
                       [<ffffffffa024d366>] drbdd_init+0xf16/0x1150 [drbd]
                       [<ffffffffa0255262>] drbd_thread_setup+0x72/0x100 [drbd]
                       [<ffffffff81071c06>] kthread+0x96/0xa0
                       [<ffffffff81407ad4>] kernel_thread_helper+0x4/0x10
  INITIAL USE at:
                      [<ffffffff810887a4>] __lock_acquire+0x3e4/0x1d70
                      [<ffffffff8108a725>] lock_acquire+0x85/0x130
                      [<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
                      [<ffffffffa025d554>] drbd_req_state+0x74/0x4a0 [drbd]
                      [<ffffffffa025d9b2>] _drbd_request_state+0x32/0xe0 [drbd]
                      [<ffffffffa026674c>] drbd_nl_disk_conf+0x72c/0x11c0
[drbd]
                      [<ffffffffa02689e9>] drbd_connector_callback+0x139/0x750
[drbd]
                      [<ffffffff81298a7b>] cn_rx_skb+0xdb/0x100
                      [<ffffffff81357603>] netlink_unicast+0x2a3/0x2f0
                      [<ffffffff8135787b>] netlink_sendmsg+0x22b/0x360
                      [<ffffffff8131f971>] sock_sendmsg+0xe1/0x120
                      [<ffffffff813227bd>] sys_sendto+0x10d/0x140
                      [<ffffffff814059ab>] system_call_fastpath+0x16/0x1b
}
... key      at: [<ffffffffa0275a64>] __key.46240+0x0/0xffffffffffff359c [drbd]
... acquired at:
  [<ffffffff81087564>] check_usage_forwards+0x94/0x110
  [<ffffffff8108816b>] mark_lock+0x19b/0x3f0
  [<ffffffff81088ad5>] __lock_acquire+0x715/0x1d70
  [<ffffffff8108a725>] lock_acquire+0x85/0x130
  [<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
  [<ffffffffa023e072>] drbd_endio_read_sec_final+0x42/0x330 [drbd]
  [<ffffffffa023e3d9>] drbd_endio_sec+0x79/0x4d0 [drbd]
  [<ffffffff811611bd>] bio_endio+0x1d/0x40
  [<ffffffffa02232c4>] dec_pending+0x1f4/0x3c0 [dm_mod]
  [<ffffffffa0223533>] clone_endio+0xa3/0xe0 [dm_mod]
  [<ffffffff811611bd>] bio_endio+0x1d/0x40
  [<ffffffff811c147b>] req_bio_endio.clone.41+0xab/0xf0
  [<ffffffff811c3640>] blk_update_request+0x100/0x570
  [<ffffffff811c3ae4>] blk_update_bidi_request+0x34/0xa0
  [<ffffffff811c3ccf>] blk_end_bidi_request+0x2f/0x80
  [<ffffffff811c3d60>] blk_end_request+0x10/0x20
  [<ffffffff812b3f7f>] scsi_io_completion+0xaf/0x630
  [<ffffffff812aace8>] scsi_finish_command+0xc8/0x130
  [<ffffffff812b3dcf>] scsi_softirq_done+0x13f/0x160
  [<ffffffff811ca4a0>] blk_done_softirq+0x80/0xa0
  [<ffffffff81055839>] __do_softirq+0xc9/0x230
  [<ffffffff81407bcc>] call_softirq+0x1c/0x30
  [<ffffffff81004c0d>] do_softirq+0x8d/0xc0
  [<ffffffff81055d0e>] irq_exit+0x9e/0xc0
  [<ffffffff81408176>] do_IRQ+0x66/0xe0
  [<ffffffff81405070>] ret_from_intr+0x0/0x13
  [<ffffffff812d945c>] mptspi_qcmd+0x8c/0xd0
  [<ffffffff812ac05e>] scsi_dispatch_cmd+0x13e/0x340
  [<ffffffff812b30e7>] scsi_request_fn+0x397/0x490
  [<ffffffff811c545f>] __make_request+0x32f/0x350
  [<ffffffff811c2ea9>] generic_make_request+0x1f9/0x5c0
  [<ffffffffa024943a>] drbd_submit_ee+0x13a/0x230 [drbd]
  [<ffffffffa024ab96>] receive_DataRequest+0x4b6/0x850 [drbd]
  [<ffffffffa024d366>] drbdd_init+0xf16/0x1150 [drbd]
  [<ffffffffa0255262>] drbd_thread_setup+0x72/0x100 [drbd]
  [<ffffffff81071c06>] kthread+0x96/0xa0
  [<ffffffff81407ad4>] kernel_thread_helper+0x4/0x10


stack backtrace:
Pid: 5871, comm: drbd0_receiver Not tainted 3.0.0-scst-debug+ #6
Call Trace:
<IRQ>  [<ffffffff81087486>] print_irq_inversion_bug+0x1b6/0x200
[<ffffffff810874d0>] ? print_irq_inversion_bug+0x200/0x200
[<ffffffff81087564>] check_usage_forwards+0x94/0x110
[<ffffffff8108816b>] mark_lock+0x19b/0x3f0
[<ffffffff81404d25>] ? _raw_spin_unlock_irqrestore+0x65/0x80
[<ffffffff81088ad5>] __lock_acquire+0x715/0x1d70
[<ffffffff81125b97>] ? __slab_free+0x377/0x440
[<ffffffff8112500a>] ? free_debug_processing+0x1ca/0x350
[<ffffffff810e4f00>] ? mempool_free_pages+0x10/0x10
[<ffffffff8108b1be>] ? trace_hardirqs_on_caller+0xde/0x1a0
[<ffffffff8108a725>] lock_acquire+0x85/0x130
[<ffffffffa023e072>] ? drbd_endio_read_sec_final+0x42/0x330 [drbd]
[<ffffffff8140447c>] ? _raw_spin_lock_irqsave+0x1c/0x60
[<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
[<ffffffffa023e072>] ? drbd_endio_read_sec_final+0x42/0x330 [drbd]
[<ffffffffa023e072>] drbd_endio_read_sec_final+0x42/0x330 [drbd]
[<ffffffffa023e3d9>] drbd_endio_sec+0x79/0x4d0 [drbd]
[<ffffffff811611bd>] bio_endio+0x1d/0x40
[<ffffffffa02232c4>] dec_pending+0x1f4/0x3c0 [dm_mod]
[<ffffffffa022311c>] ? dec_pending+0x4c/0x3c0 [dm_mod]
[<ffffffffa0223533>] clone_endio+0xa3/0xe0 [dm_mod]
[<ffffffff811611bd>] bio_endio+0x1d/0x40
[<ffffffff811c147b>] req_bio_endio.clone.41+0xab/0xf0
[<ffffffff811c3640>] blk_update_request+0x100/0x570
[<ffffffff811c38c2>] ? blk_update_request+0x382/0x570
[<ffffffff811c3ae4>] blk_update_bidi_request+0x34/0xa0
[<ffffffff811c3ccf>] blk_end_bidi_request+0x2f/0x80
[<ffffffff811c3d60>] blk_end_request+0x10/0x20
[<ffffffff812b3f7f>] scsi_io_completion+0xaf/0x630
[<ffffffff812aace8>] scsi_finish_command+0xc8/0x130
[<ffffffff812b3dcf>] scsi_softirq_done+0x13f/0x160
[<ffffffff811ca4a0>] blk_done_softirq+0x80/0xa0
[<ffffffff81055839>] __do_softirq+0xc9/0x230
[<ffffffff81407bcc>] call_softirq+0x1c/0x30
[<ffffffff81004c0d>] do_softirq+0x8d/0xc0
[<ffffffff81055d0e>] irq_exit+0x9e/0xc0
[<ffffffff81408176>] do_IRQ+0x66/0xe0
[<ffffffff812ab860>] ? perf_trace_scsi_cmd_done_timeout_template+0x1b0/0x1b0
[<ffffffff81405070>] common_interrupt+0x70/0x70
<EOI>  [<ffffffff81404d27>] ? _raw_spin_unlock_irqrestore+0x67/0x80
[<ffffffff812d945c>] mptspi_qcmd+0x8c/0xd0
[<ffffffff812ac05e>] scsi_dispatch_cmd+0x13e/0x340
[<ffffffff812b30e7>] scsi_request_fn+0x397/0x490
[<ffffffff811c545f>] __make_request+0x32f/0x350
[<ffffffffa0224193>] ? dm_request+0x33/0x280 [dm_mod]
[<ffffffff811c2ea9>] generic_make_request+0x1f9/0x5c0
[<ffffffff81161a64>] ? __bio_add_page+0x114/0x220
[<ffffffffa024943a>] drbd_submit_ee+0x13a/0x230 [drbd]
[<ffffffff81043609>] ? sub_preempt_count+0xa9/0xe0
[<ffffffffa024ab96>] receive_DataRequest+0x4b6/0x850 [drbd]
[<ffffffffa024d366>] drbdd_init+0xf16/0x1150 [drbd]
[<ffffffffa02551f0>] ? we_should_drop_the_connection+0xd0/0xd0 [drbd]
[<ffffffffa0255262>] drbd_thread_setup+0x72/0x100 [drbd]
[<ffffffffa02551f0>] ? we_should_drop_the_connection+0xd0/0xd0 [drbd]
[<ffffffff81071c06>] kthread+0x96/0xa0
[<ffffffff81407ad4>] kernel_thread_helper+0x4/0x10
[<ffffffff81038383>] ? finish_task_switch+0x83/0xe0
[<ffffffff81404c9b>] ? _raw_spin_unlock_irq+0x3b/0x60
[<ffffffff81405121>] ? retint_restore_args+0xe/0xe
[<ffffffff81071b70>] ? __init_kthread_worker+0x70/0x70
[<ffffffff81407ad0>] ? gs_change+0xb/0xb
block drbd0: Out of sync: start=8, size=712 (sectors)

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


More information about the Bugme-new mailing list