[Bugme-new] [Bug 21392] New: Incorrect assembly of raid partitions on boot

bugzilla-daemon at bugzilla.kernel.org bugzilla-daemon at bugzilla.kernel.org
Thu Oct 28 17:00:02 PDT 2010


https://bugzilla.kernel.org/show_bug.cgi?id=21392

           Summary: Incorrect assembly of raid partitions on boot
           Product: IO/Storage
           Version: 2.5
    Kernel Version: All, 2.6.36
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: MD
        AssignedTo: io_md at kernel-bugs.osdl.org
        ReportedBy: chad.farmer at bull.com
        Regression: No


The problem is that autorun_devices in md.c builds a candidates list of
partitions and calls bind_rdev_to_array in the order the partitions were found,
without regard for the state of the partition.  Function bind_rdev_to_array
requires a unique mdk_rdev_t desc_nr value, so when partitions exist with the
same desc_nr in their superblock (sb->this_disk.number), duplicates are
rejected.  The rejected duplicate may be the current device that is needed to
assemble the array.
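To make the ordering concrete, here is a minimal user-space sketch of my own
(not the kernel code; the struct, array, and variable names are made up) that
models binding candidates in discovery order and rejecting a later candidate
whose superblock carries an already-bound desc_nr:

/* Illustration only: candidates are bound in the order they were found,
 * and a later candidate whose superblock claims an already-seen desc_nr
 * is dropped, even though it is the fresher device. */
#include <stdio.h>

struct candidate {
	const char *name;   /* partition, e.g. "sda1"               */
	int desc_nr;        /* sb->this_disk.number                 */
	int events;         /* superblock event count; higher = fresher */
};

int main(void)
{
	/* Bind order from the log below: sda1 first, then sdb1, then sdc1. */
	struct candidate candidates[] = {
		{ "sda1", 0, 10 },  /* failed primary, stale superblock      */
		{ "sdb1", 1, 24 },  /* current secondary                     */
		{ "sdc1", 0, 24 },  /* current primary, same desc_nr as sda1 */
	};
	int bound[8] = { 0 };   /* desc_nr values already bound */
	unsigned i;

	for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
		struct candidate *c = &candidates[i];
		if (bound[c->desc_nr]) {
			/* corresponds to bind_rdev_to_array() failing and the
			 * device being dropped via export_rdev()              */
			printf("export_rdev(%s): duplicate desc_nr %d\n",
			       c->name, c->desc_nr);
			continue;
		}
		bound[c->desc_nr] = 1;
		printf("bind<%s> (desc_nr %d, events %d)\n",
		       c->name, c->desc_nr, c->events);
	}
	return 0;
}

Running this prints bind<sda1>, bind<sdb1>, then export_rdev(sdc1), which is
exactly the pattern in the log below: the stale sda1 wins the desc_nr 0 slot
and the current sdc1 is discarded before the freshness check ever sees it.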

The following test scenario demonstrates this problem.

Create a raid1 group across three drives: sda1 primary, sdb1 secondary, sdc1
spare.  For simplicity I did not use LVM.  Use the "mdadm --fail" command on
sda1.  After sdb1 resyncs with sdc1, reboot.  After booting, the raid is
running with a single partition, sdb1.  The following messages report the
problem (in this case, the system also has an sdd1, but it is not current).

Oct 27 10:04:58 hms1 kernel: [   28.570081] md: considering sdd1 ...
Oct 27 10:04:59 hms1 kernel: [   28.573747] md:  adding sdd1 ...
Oct 27 10:04:59 hms1 kernel: [   28.577065] md:  adding sdc1 ...
Oct 27 10:04:59 hms1 kernel: [   28.580384] md:  adding sdb1 ...
Oct 27 10:04:59 hms1 kernel: [   28.583706] md:  adding sda1 ...
All four partitions were put into the candidates list.
Oct 27 10:04:59 hms1 kernel: [   28.587058] md: created md0
Oct 27 10:04:59 hms1 kernel: [   28.589942] md: bind<sda1>
The failed sda1 is desc_nr 0.
Oct 27 10:04:59 hms1 kernel: [   28.592744] md: bind<sdb1>
The current secondary sdb1 is desc_nr 1.
Oct 27 10:04:59 hms1 kernel: [   28.595547] md: export_rdev(sdc1)
The current primary sdc1 is rejected due to duplicate desc_nr 0.
Oct 27 10:04:59 hms1 kernel: [   28.598953] md: export_rdev(sdd1)
Oct 27 10:04:59 hms1 kernel: [   28.602359] md: running: <sdb1><sda1>
Oct 27 10:04:59 hms1 kernel: [   28.606205] md: kicking non-fresh sda1 from array! events 24, 10
This is correct; sda1 really is not fresh.
Oct 27 10:04:59 hms1 kernel: [   28.612304] md: unbind<sda1>
Oct 27 10:04:59 hms1 kernel: [   28.619049] md: export_rdev(sda1)
Oct 27 10:04:59 hms1 kernel: [   28.623113] raid1: raid set md0 active with 1 out of 2 mirrors
This is wrong.  The current primary partition, sdc1, is present and
operational, but was not picked up.
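
Purely as an illustration (not a proposed patch, and not code from md.c), the
event counts already visible in the log (24 vs 10) are enough to tell which of
two candidates claiming the same desc_nr is current.  The names below
(sb_info, pick_fresher) are hypothetical:

#include <stdio.h>

struct sb_info {
	const char *name;
	int desc_nr;           /* sb->this_disk.number                 */
	unsigned long events;  /* superblock event count; higher = fresher */
};

/* Hypothetical helper: prefer the candidate with the higher event count. */
static const struct sb_info *pick_fresher(const struct sb_info *a,
					  const struct sb_info *b)
{
	return (a->events >= b->events) ? a : b;
}

int main(void)
{
	/* Values taken from the log above: sda1 events 10, sdc1 events 24. */
	struct sb_info sda1 = { "sda1", 0, 10 };
	struct sb_info sdc1 = { "sdc1", 0, 24 };

	printf("keep %s for desc_nr 0\n", pick_fresher(&sda1, &sdc1)->name);
	return 0;
}

In other words, the information needed to keep sdc1 instead of sda1 is in the
superblocks at candidate time; it is only consulted after the duplicate has
already been rejected.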

I confess that I found this on an older Red Hat 5.3 kernel, but the 2.6.36
md.c module has the same code.  If I've incorrectly analyzed this, please
enlighten me.

I've seen this situation on a production system where an unrecovered I/O error
caused sda1 to be failed (and the device disabled in Linux).  The recovery was
correct.  On the next boot (done with a power cycle), the sda disk was again
operational, its superblock was readable, and the raid was incorrectly
assembled.
