[Bugme-new] [Bug 17491] New: Reproducible crash on large 64bit write to sata device
bugzilla-daemon at bugzilla.kernel.org
Mon Aug 30 11:27:32 PDT 2010
https://bugzilla.kernel.org/show_bug.cgi?id=17491
Summary: Reproducible crash on large 64bit write to sata device
Product: IO/Storage
Version: 2.5
Kernel Version: 2.6.31 64bit, 2.6.35 64bit
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Serial ATA
AssignedTo: jgarzik at pobox.com
ReportedBy: carl.janzen at gmail.com
Regression: No
Created an attachment (id=28461)
--> (https://bugzilla.kernel.org/attachment.cgi?id=28461)
Ubuntu 9.10 livecd dmesg
This bug affects recent 64bit kernels, including the kernel in Ubuntu
10.10 alpha 3 (kernel 2.6.35). The SMART data from the brand new 2TB
Western Digital hard drive involved shows no errors (the motherboard is a brand
new Asus P5Q Pro Turbo). I tried it on an older hard drive (also a 2TB Western
Digital) and an earlier motherboard (Asus P5B) with the same results.
This error likely affects other hard drives, or most likely has nothing to do
with the hard drives at all. I had a problem with corruption of my 4-drive
array of 500MB Western Digital drives. Before rebuilding the array I wanted to
copy the data from those drives across to a 2TB backup, and that's when I
started seeing this reproducible crash. Leading up to this point the system had
been crashing every other day or so, which suggests to me that the bug
probably caused that file system corruption as well.
The kernel on the Fedora 10 live DVD (2.6.27) does not crash, but I didn't
confirm whether it produces the messages described below.
The kernel on the Ubuntu 9.10 live CD (2.6.31-14-generic) does not crash either,
but produced the enclosed dmesg, messages, lspci and smartctl files. Judging by
the attempts to access blocks past the end of the device, it looks like a
64bit-specific problem. Convert the block number from the log to hex and it
stands out conspicuously.
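As a rough illustration of the hex check mentioned above, here is a minimal Python sketch. The helper name, the sector value and the device size are hypothetical placeholders, not figures taken from the attached dmesg; substitute the block number and device size from the actual log.

```python
# Hypothetical helper for illustration; none of these values come from the logs.
SECTOR_SIZE = 512  # bytes per logical sector (assumed)


def describe_sector(sector, device_sectors):
    """Summarize whether `sector` lies past the end of a device."""
    return {
        # 16 hex digits so the upper and lower 32-bit halves are easy to compare
        "hex": f"{sector:#018x}",
        "past_end": sector >= device_sectors,
        "high_32_bits_set": (sector >> 32) != 0,
    }


# Placeholder inputs: a nominal 2 TB device and a made-up out-of-range sector.
device_sectors = 2 * 10**12 // SECTOR_SIZE
example_sector = 0xFFFFFFFF00000010
info = describe_sector(example_sector, device_sectors)
print(info)
```

A sector number whose upper 32 bits are set on a device this small cannot be a valid address, which is the kind of pattern that would stand out in hex.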
The latest Ubuntu distribution freezes up with the keyboard LEDs flashing. I tried
to reproduce the problem in text mode so I could take a picture of the
trace/panic. That's what the two JPGs are.
The way I have been triggering the bug is with the following command:
dd if=/dev/zero of=/dev/sdb1 bs=2048
There does not seem to be a predictable delay between the start of that command
and when it actually crashes/freezes or produces the errors in the log file.
Sometimes I can transfer 40GB before the error hits. Once it happened
immediately. I also noticed that upon detection of the device there is a
complaint: "device reported invalid CHS sector 0".
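Because the failure point varies so much, it may help to record how far the write got before things went wrong. The following Python sketch mimics the dd loop above with the same 2048-byte block size and returns the number of bytes written. The function name `write_zeros` and its parameters are hypothetical, and this is only an approximation of what dd does, not a replacement for it.

```python
def write_zeros(path, block_size=2048, max_bytes=None, report_every=1 << 30):
    """Write zero-filled blocks to `path`; return the number of bytes written.

    WARNING: pointing this at a block device destroys its contents.
    """
    block = bytes(block_size)  # block_size zero bytes
    written = 0
    # buffering=0 issues one write() call per block
    with open(path, "wb", buffering=0) as out:
        try:
            while max_bytes is None or written < max_bytes:
                written += out.write(block)
                if written % report_every == 0:
                    print(f"{written} bytes written", flush=True)
        except OSError as exc:
            # On a block device, running out of space at the end is the normal stop.
            print(f"stopped after {written} bytes: {exc}", flush=True)
    return written
```

Calling `write_zeros("/dev/sdb1")` would correspond to the command above and is just as destructive; passing `max_bytes` and a scratch file path allows a harmless dry run.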