[Bugme-new] [Bug 10582] New: INFO: task pdflush:27505 blocked for more than 120 seconds.

bugme-daemon at bugzilla.kernel.org
Thu May 1 02:30:37 PDT 2008


http://bugzilla.kernel.org/show_bug.cgi?id=10582

           Summary: INFO: task pdflush:27505 blocked for more than 120
                    seconds.
           Product: File System
           Version: 2.5
     KernelVersion: 2.6.25-git7 and later
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: blocking
          Priority: P1
         Component: XFS
        AssignedTo: xfs-masters at oss.sgi.com
        ReportedBy: pvp-lsts at fs.ru.acad.bg


Latest working kernel version: 2.6.25, maybe up to 2.6.25-git6
Earliest failing kernel version: 2.6.25-git7
Distribution: Bluewhite64
Hardware Environment: GA-P35-DS3R mobo, Intel E4300 CPU, 2 GB RAM, 500 GB WD
SATA HDD
Software Environment: KDE desktop
Problem Description: some processes stall, effectively blocking all
write access to the root partition (XFS); if I wait long enough, a lot of
messages like this appear in the logs:

[ 1032.940632] INFO: task xfsdatad/0:317 blocked for more than 120 seconds.
[ 1032.940638] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1032.940642] xfsdatad/0    D ffff81000102cdc0     0   317      2
[ 1032.940683]  ffff81007f9bbdd0 0000000000000046 0000000000000000 0000000000000000
[ 1032.940691]  ffff81007fb335a0 ffffffff8089c4a0 ffff81007fb338e8 000000010000ea09
[ 1032.940697]  0000000000000000 0000000000000000 0000000000000003 0000000000000000
[ 1032.940703] Call Trace:
[ 1032.940711]  [<ffffffff803a1f60>] xfs_end_bio_delalloc+0x0/0x20
[ 1032.940717]  [<ffffffff806b8a29>] __down_write_nested+0x79/0xc0
[ 1032.940800]  [<ffffffff8037f125>] xfs_ilock+0xa5/0xe0
[ 1032.940811]  [<ffffffff803a1db0>] xfs_setfilesize+0x40/0xc0
[ 1032.940814]  [<ffffffff803a1f70>] xfs_end_bio_delalloc+0x10/0x20
[ 1032.940817]  [<ffffffff8024c8f0>] run_workqueue+0x140/0x220
[ 1032.940820]  [<ffffffff8024caa0>] worker_thread+0x0/0xd0
[ 1032.940822]  [<ffffffff8024cb31>] worker_thread+0x91/0xd0
[ 1032.940825]  [<ffffffff80250840>] autoremove_wake_function+0x0/0x30
[ 1032.940828]  [<ffffffff8024caa0>] worker_thread+0x0/0xd0
[ 1032.940830]  [<ffffffff8024caa0>] worker_thread+0x0/0xd0
[ 1032.940832]  [<ffffffff802503ab>] kthread+0x4b/0x80
[ 1032.940835]  [<ffffffff8020c428>] child_rip+0xa/0x12
[ 1032.940837]  [<ffffffff802504f4>] kthreadd+0x114/0x1a0
[ 1032.940839]  [<ffffffff80250360>] kthread+0x0/0x80
[ 1032.940940]  [<ffffffff8020c41e>] child_rip+0x0/0x12
[ 1032.940942] 
[ 1032.940943] INFO: lockdep is turned off.
[ 1032.940953] INFO: task pdflush:27505 blocked for more than 120 seconds.
[ 1032.940955] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1032.940957] pdflush       D ffff81000102cdc0     0 27505      2
[ 1032.940960]  ffff810065bef4c0 0000000000000046 0000000000000000 ffffffff8040ce06
[ 1032.940964]  ffff81006dad5960 ffffffff8089c4a0 ffff81006dad5ca8 000000010000e9f2
[ 1032.940968]  0000000000036a20 000000008038c9f7 0000000000000003 ffffffff8038a767
[ 1032.940972] Call Trace:
[ 1032.940976]  [<ffffffff8040ce06>] __spin_lock_init+0x36/0x70
[ 1032.940979]  [<ffffffff8038a767>] xlog_grant_push_ail+0x47/0x160
[ 1032.940982]  [<ffffffff806b8a29>] __down_write_nested+0x79/0xc0
[ 1032.940984]  [<ffffffff8037f125>] xfs_ilock+0xa5/0xe0
[ 1032.940987]  [<ffffffff8038661b>] xfs_iomap_write_allocate+0x11b/0x3c0
[ 1032.940990]  [<ffffffff806b8ea1>] _spin_lock_irqsave+0x41/0x60
[ 1032.940993]  [<ffffffff8038742e>] xfs_iomap+0x23e/0x2d0
[ 1032.940995]  [<ffffffff803a2067>] xfs_map_blocks+0x37/0x90
[ 1032.940997]  [<ffffffff803a3576>] xfs_page_state_convert+0x296/0x640
[ 1032.941001]  [<ffffffff80253635>] ktime_get_ts+0x25/0x60
[ 1032.941003]  [<ffffffff806b9519>] _spin_unlock+0x29/0x50
[ 1032.941006]  [<ffffffff8025367c>] ktime_get+0xc/0x50
[ 1032.941008]  [<ffffffff803a3a58>] xfs_vm_writepage+0x68/0x110
[ 1032.941012]  [<ffffffff8027800e>] shrink_page_list+0x52e/0x680
[ 1032.941015]  [<ffffffff803ec57d>] blk_recount_segments+0x3d/0x80
[ 1032.941018]  [<ffffffff8026fc7b>] mempool_alloc+0x4b/0x140
[ 1032.941020]  [<ffffffff80277771>] isolate_lru_pages+0x1a1/0x240
[ 1032.941023]  [<ffffffff802782c4>] shrink_inactive_list+0x164/0x450
[ 1032.941026]  [<ffffffff80278993>] shrink_zone+0xb3/0x130
[ 1032.941028]  [<ffffffff8027919f>] try_to_free_pages+0x24f/0x3d0
[ 1032.941031]  [<ffffffff80277810>] isolate_pages_global+0x0/0x40
[ 1032.941034]  [<ffffffff802728f5>] __alloc_pages_internal+0x1b5/0x460
[ 1032.941036]  [<ffffffff80272c35>] __get_free_pages+0x15/0x60
[ 1032.941038]  [<ffffffff803a1b5b>] kmem_alloc+0x5b/0x100
[ 1032.941041]  [<ffffffff8038410a>] xfs_iflush_cluster+0x4a/0x3b0
[ 1032.941043]  [<ffffffff806b9519>] _spin_unlock+0x29/0x50
[ 1032.941046]  [<ffffffff80383049>] xfs_iflush_int+0x2d9/0x340
[ 1032.941048]  [<ffffffff803846e0>] xfs_iflush+0x270/0x310
[ 1032.941052]  [<ffffffff8039bea1>] xfs_inode_flush+0xb1/0xe0
[ 1032.941055]  [<ffffffff803ab8d5>] xfs_fs_write_inode+0x25/0x70
[ 1032.941058]  [<ffffffff802b901f>] __writeback_single_inode+0x25f/0x350
[ 1032.941061]  [<ffffffff806b9519>] _spin_unlock+0x29/0x50
[ 1032.941064]  [<ffffffff803899aa>] xfs_log_need_covered+0x7a/0xd0
[ 1032.941066]  [<ffffffff802b9577>] sync_sb_inodes+0x207/0x310
[ 1032.941069]  [<ffffffff802b98d2>] writeback_inodes+0xa2/0xf0
[ 1032.941071]  [<ffffffff80273df6>] wb_kupdate+0xa6/0x120
[ 1032.941073]  [<ffffffff80274ee0>] pdflush+0x0/0x1f0
[ 1032.941076]  [<ffffffff80274ee0>] pdflush+0x0/0x1f0
[ 1032.941078]  [<ffffffff80275001>] pdflush+0x121/0x1f0
[ 1032.941080]  [<ffffffff80273d50>] wb_kupdate+0x0/0x120
[ 1032.941082]  [<ffffffff802503ab>] kthread+0x4b/0x80
[ 1032.941084]  [<ffffffff8020c428>] child_rip+0xa/0x12
[ 1032.941087]  [<ffffffff802504f4>] kthreadd+0x114/0x1a0
[ 1032.941089]  [<ffffffff80250360>] kthread+0x0/0x80
[ 1032.941091]  [<ffffffff8020c41e>] child_rip+0x0/0x12
[ 1032.941092] 
[ 1032.941093] INFO: lockdep is turned off.

After this the system becomes unusable. For example, I was able to switch
to the text console and issue a "reboot" command, but it didn't work. If I
start SeaMonkey not long after the stall occurs, it works, but not for
long: it stalls too as soon as I try to do something that requires write
access to the root partition. I tried "sync"-ing; it blocks, too.

Steps to reproduce: not sure exactly; to me it looks like a "git pull" command
is triggering the problem. Of all the other programs running, ktorrent may be
notable, but the problem happened once even when ktorrent was not started. On
the system where the problem occurs there are only XFS and NTFS partitions; I
used the NTFS one to save some of dmesg's output, since the XFS ones were
unusable.
See also: http://lkml.org/lkml/2008/4/30/176 - this is an email I sent to ask
for help on how to debug the problem, but as far as I can see all the devs are
busy right now due to the merge window, hence this bug report.



