Koozali.org: home of the SME Server

Fun with RAID1.... [UUU] and spare drives

ReetP
« on: April 19, 2012, 01:00:31 PM »
I have run into two scenarios recently, and I thought I would post my solutions for anyone who may be interested. If anyone feels that any of this should go in the RAID wiki page then tell me and I'll add it.

I had previously added a 3rd drive to a 2-drive RAID1 array. However, at some point this had gone pear-shaped; how and why I do not know at the moment. But I ended up with my array looking like this :

cat /proc/mdstat revealed :

Personalities : [raid1]
md1 : active raid1 sdc1[2] sdb1[1] sda1[0]
      104320 blocks [3/3] [UUU]

md2 : active raid1 sdc2[2](S) sdb2[1] sda2[0]
      244091520 blocks [2/2] [UU]

unused devices: <none>

I had somehow ended up with 3 active drives in the first partition array, md1, instead of a hot spare as in md2. (I have literally just read the following post and think that this may now be a bug : http://forums.contribs.org/index.php/topic,48362.msg240019.html#msg240019 - I will report it accordingly, but am posting my solution here anyway.)

I wasn't aware that this was even possible, and a lot of searching on how to cure the problem finally led me to post a question on the linux-raid list.

The answer is simple: mdadm doesn't impose a limit on how many drives you can have or what sort of array they can be in, so a RAID1 mirror can have 3 drives or more if you so require.
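
A quick way to see how mdadm views each member, and whether a drive is an active mirror or a hot spare, is the detail view; something like this (device name as per my array) :

Code: [Select]
mdadm --detail /dev/md1
In the output, "Raid Devices" is the number of active mirrors, "Total Devices" includes any spares, and each member is listed at the bottom as "active sync" or "spare".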

Neil Brown provided this explanation & solution :

Quote
You don't want that?  Change it.

   mdadm --grow /dev/md1 --raid-disks=4

now it has 4 devices - though one will be missing.

   mdadm --grow /dev/md1 --raid-disks=2

now it has 2 devices.  Actually that won't work until you mark one of the
devices as failed, so

   mdadm /dev/md1 --fail /dev/sdc1
   mdadm --grow /dev/md1 --raid-disks=2

So to shrink an array and remove devices, you have to grow it. Illogical in the extreme.
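
Putting Neil's explanation together for my case, turning the third active mirror back into a hot spare would look something like this (just a sketch, using the device names from the mdstat output above - check your own before copying anything) :

Code: [Select]
mdadm /dev/md1 --fail /dev/sdc1
mdadm --grow /dev/md1 --raid-disks=2
mdadm /dev/md1 --remove /dev/sdc1
mdadm /dev/md1 --add /dev/sdc1
The fail and grow steps shrink the array back to a 2-way mirror as per the quote; removing and re-adding the partition then brings it back in as a spare rather than an active member.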

================================================================================================


On my other machine I seemed to suffer a problem with adding a spare drive.

I originally filed this as a bug, but it is considered NOTABUG :
Extra hard drive installed but not detected as spare
http://bugs.contribs.org/show_bug.cgi?id=6904

A second bug was created and an amendment made in the wiki RAID page :
Extra hard drive installed but not detected as spare
http://bugs.contribs.org/show_bug.cgi?id=6907

As per the wiki, adding another drive as a spare is possible, but not a 'supported' option. There is a related bug/NFR here :

Automatically detect new drive on bootup and add it to degraded RAID array
http://bugs.contribs.org/show_bug.cgi?id=5586

However, this ONLY covers the situation where one drive has failed and been replaced with a new one. It does not cover the addition of a spare to the array.

I thought I could be smart and use /sbin/e-smith/add_drive_to_raid

But after running it, I looked at the partition table and saw something like this :

Disk /dev/sda: 250.0 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104384+  fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2              13       30401   244091647   fd  Linux raid autodetect

See the following bugs for some more info on the subject :

Added new larger HDD to RAID1 array and syncs, but can't boot from drive
http://bugs.contribs.org/show_bug.cgi?id=6630
Cloned to v8 :
http://bugs.contribs.org/show_bug.cgi?id=6911

It is to do with modern drives using LBA addressing rather than CHS.
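
If you want to see what the partitions actually look like without the cylinder arithmetic getting in the way, listing them in sectors avoids the misleading warning; roughly this (device name assumed) :

Code: [Select]
fdisk -lu /dev/sda
The -u switch prints start and end in sectors rather than cylinders, which is what actually matters on an LBA drive.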

On top of that, I could not seem to get grub to install correctly.

My eventual workaround was to copy the partition table across manually with :

Code: [Select]
sfdisk -d /dev/hda | sfdisk /dev/sda
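It is worth double-checking that the two tables now match before touching the arrays; something like this should do it (the dump paths are just where I would put them) :

Code: [Select]
sfdisk -d /dev/hda > /root/hda-part.dump
sfdisk -d /dev/sda > /root/sda-part.dump
diff /root/hda-part.dump /root/sda-part.dump
Apart from the device names, the two dumps should be identical.
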
Then sort out /etc/mdadm.conf - take a backup first and then something like this should do it:

Code: [Select]
echo "DEVICE partitions" > /etc/mdadm.conf
mdadm --examine --scan >> /etc/mdadm.conf

I could then add the drive to the array with mdadm :

Code: [Select]
mdadm --manage /dev/md1 -a /dev/sda1
mdadm --manage /dev/md2 -a /dev/sda2
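
You can keep an eye on the arrays while they take the new partitions; a couple of harmless read-only commands for that :

Code: [Select]
watch -n 5 cat /proc/mdstat
mdadm --detail /dev/md2
The new partition should show up with a state of "spare" once md2 is happy.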

I made sure that the drive was in the /boot/grub/device.map (back up the file first):

Code: [Select]
grub --device-map=/boot/grub/device.map
If the file does not already exist, grub will probe the drives and create it; either way it should end up looking something like this, and you can edit it manually :

(hd0)     /dev/hda
(hd1)     /dev/hdc
(hd2)     /dev/sda

Then I ran grub manually :

Code: [Select]
grub
root (hd0,0)
setup (hd0)
root (hd1,0)
setup (hd1)
root (hd2,0)
setup (hd2)
quit
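
If grub complains that it cannot find one of the drives, the grub shell also lets you map a disk by hand before running setup; roughly like this (the drive and mapping here are just an example) :

Code: [Select]
grub
device (hd2) /dev/sda
root (hd2,0)
setup (hd2)
quit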

I then found that the drive was not remembered across reboots. This was because the new drive was SATA, whereas the original drives were IDE/PATA.

I finally figured out that this was because the SATA driver was not loaded early enough to pick up the new drive, so I needed to rebuild the initrd (it all sounds very professional, but it's all on a wing and a prayer and makes me quake at the knees !!!)

Before you do this it would be worth making a backup of your boot files, ensuring that you have at least one working configuration, and having a rescue CD to hand as well :-)

Then you need to do something like this :

Code: [Select]
ls /boot/initrd-*.img | sed 's,.*initrd-\(.*\)\.img,\1,' | \
while read initrd; do
    mkinitrd -f /boot/initrd-$initrd.img $initrd
done
That will recreate your initrd files with the SATA driver built in, and you should be able to reboot with the drive as a hot spare in the array. Then test it all by 'failing' drives and seeing what happens. Make sure the data is backed up first...
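
To check that the rebuilt initrd really did pick up the SATA driver, and to exercise the spare, something along these lines should work (the grep pattern and the partition to 'fail' are just examples for my setup - and only do this with a good backup) :

Code: [Select]
zcat /boot/initrd-`uname -r`.img | cpio -t | grep -i sata
mdadm /dev/md2 --fail /dev/sdb2
cat /proc/mdstat
mdadm /dev/md2 --remove /dev/sdb2
mdadm /dev/md2 --add /dev/sdb2
The first line lists the initrd contents (on these kernels it is a gzipped cpio archive) and should show the SATA module; failing a partition should kick the hot spare into rebuilding, after which the 'failed' partition can be removed and re-added.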

The upshot is that I now have my Hot Spare in my array and I can sleep a little easier at night.
...
1. Read the Manual
2. Read the Wiki
3. Don't ask for support on Unsupported versions of software
4. I have a job, wife, and kids and do this in my spare time. If you want something fixed, please help.

Bugs are easier than you think: http://wiki.contribs.org/Bugzilla_Help

If you love SME and don't want to lose it, join in: http://wiki.contribs.org/Koozali_Foundation