
mdadm failed drive replacement

So, the array that has been happily working for you for an age suddenly has a failed HDD. What do you do? You start Googling and, hours later, find that most of the discussion on this topic involves simulating a drive failure: carefully telling mdadm that a drive is in a failed state and then reintroducing that same drive. Totally pointless. It is fairly rare for an HDD to politely announce that it is about to fail and then suddenly fix itself.

This is how to deal with a drive that has done what drives do sometimes: suddenly fail. I'm using CentOS 6.5.

First of all you need to establish which drive has failed from the system's point of view. Do this:

cat /proc/mdstat

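That should tell you which drive has gone from mdadm's point of view: a failed member is marked (F) or is missing from the member list altogether, and the status string shows an underscore in place of a U, for example [UU_U]. Let's say it's /dev/sdd. I'll try to help a little with how best to map system device names to physical hard drives, but right now the focus is on getting mdadm sorted.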
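If you have several arrays, picking the degraded ones out of that output by eye gets tedious. Here is a minimal sketch that does it with awk. The function name degraded_arrays is my own invention, and it assumes the usual mdstat layout where the status string (such as [UU_U]) sits on a line below the mdX line.

```shell
# Hypothetical helper: list md arrays whose status string contains "_",
# i.e. arrays with at least one missing member.
# Reads the file given as $1, defaulting to /proc/mdstat.
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md1 : active raid5 ..."
        /^md[0-9]+ :/ { name = $1 }

        # The status string looks like [UUU_]; "_" marks a missing member.
        /\[[U_]+\]/ && name != "" {
            if (match($0, /\[[U_]+\]/)) {
                status = substr($0, RSTART, RLENGTH)
                if (index(status, "_") > 0) print name, status
            }
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}

# Usage: degraded_arrays
```

Each line of output is an array name followed by its status string. No output means nothing is degraded.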

So, you know which physical drive has failed. Turn off the system, replace the drive and turn the system back on.
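If you are not sure which physical disk /dev/sdd actually is, one option before powering down is to look at the names under /dev/disk/by-id, which on most udev-based systems include the model and serial number printed on the drive label. This is only a sketch: the function name ids_for_disk is hypothetical, it assumes your system populates /dev/disk/by-id, and a drive that has died completely may no longer be listed there.

```shell
# Hypothetical helper: print the /dev/disk/by-id names that point at a disk.
# $1 is the disk (e.g. /dev/sdd); $2 is the by-id directory (optional).
ids_for_disk() {
    disk=$1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Only symlinks are of interest; skip anything else.
        [ -L "$link" ] || continue
        # Compare the fully resolved paths of the link and the disk.
        if [ "$(readlink -f "$link")" = "$(readlink -f "$disk")" ]; then
            basename "$link"
        fi
    done
}

# Usage: ids_for_disk /dev/sdd
```

Match the serial number in the output against the sticker on the drive before you pull anything.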

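If you issue an fdisk -l command you should see your new drive as /dev/sdd with no partitions. Check that it really did come up as /dev/sdd, because device names can shift between boots. mdadm needs the partitions to exist before it will let you add the new drive to the degraded array. In this example (AND PLEASE CHANGE THIS TO SUIT YOUR SITUATION) I want sdd to have the same partition layout as sdc. sgdisk is a GPT tool, so this assumes your disks use GPT partition tables. Take the following steps: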

yum install gdisk

sgdisk -R /dev/sdd /dev/sdc

sgdisk -G /dev/sdd

Please take note that the target drive, the new one, comes first in the sgdisk -R command. Target then source. That command takes the sdc partition layout and copies it to sdd. The last command, sgdisk -G, randomises the disk and partition GUIDs on sdd so that they do not clash with the ones just copied from sdc.
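Getting target and source the wrong way round would copy an empty partition table onto a healthy disk, so a small guard may be worth having. The sketch below wraps the two sgdisk commands above and refuses to run if the target looks like a member of an md array. The function name clone_gpt and the RUN and MDSTAT variables are my own assumptions, not part of sgdisk or mdadm.

```shell
# Hypothetical helper: copy the GPT from a healthy disk to the new disk,
# refusing if the "new" disk is already a member of an md array.
# Set RUN=echo to print the commands instead of running them.
clone_gpt() {
    src=$1   # healthy disk to copy FROM, e.g. /dev/sdc
    dst=$2   # new, empty disk to copy TO, e.g. /dev/sdd
    mdstat=${MDSTAT:-/proc/mdstat}

    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: clone_gpt SOURCE_DISK TARGET_DISK" >&2
        return 1
    fi

    # Members appear in mdstat as e.g. "sdc1[2]". If the target shows up
    # like that, the arguments are probably swapped.
    name=$(basename "$dst")
    if grep -Eq "(^|[[:space:]])${name}[0-9]*\[" "$mdstat"; then
        echo "$dst is an md member; refusing to overwrite it" >&2
        return 1
    fi

    # sgdisk -R takes the TARGET as its argument, then the source disk.
    ${RUN:-} sgdisk -R "$dst" "$src" &&
    ${RUN:-} sgdisk -G "$dst"
}

# Dry run:  RUN=echo; clone_gpt /dev/sdc /dev/sdd
```

Note that the helper takes source first and target second, the opposite of sgdisk -R, which is exactly why the dry run is worth doing before the real thing.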

EDIT: If the previous step didn't work for you, here's a plan B:

sgdisk --backup=table /dev/sdc

sgdisk --load-backup=table /dev/sdd

sgdisk -G /dev/sdd

The new HDD is now prepared. We will now add it to the array - the easy part.

Do a cat /proc/mdstat again and take a note of the md arrays (md0, md1 etc.). Pay attention to which partition the existing drives use per md device. For example, is md1 made up of sda1, sdb1, sdc1? If so, we'll add sdd1 to md1. Use the following command, once per degraded array, to add the new HDD's partitions:

mdadm --manage /dev/mdX -a /dev/sddY

Where X and Y fit your situation. The RAID will now rebuild.
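If you would rather not work out the X and Y pairs by hand, they can be derived from /proc/mdstat by looking at which partition of a healthy disk each array uses. The sketch below only prints the mdadm commands so you can check them before running anything. The function name suggest_adds is hypothetical, and it assumes the new disk has the same partition numbering as the healthy one, which is the case if you copied the partition table as above.

```shell
# Hypothetical helper: for every array that contains a partition of the
# healthy disk, print the mdadm command adding the matching partition of
# the new disk. Nothing is executed.
# $1 = healthy disk name (e.g. sdc), $2 = new disk name (e.g. sdd),
# $3 = mdstat file (optional, defaults to /proc/mdstat).
suggest_adds() {
    ref=$1
    new=$2
    awk -v ref="$ref" -v new="$new" '
        /^md[0-9]+ :/ {
            for (i = 2; i <= NF; i++) {
                # Member fields look like sdc1[2]; strip the [slot] suffix.
                dev = $i
                sub(/\[.*/, "", dev)
                if (index(dev, ref) == 1) {
                    # Whatever follows the disk name is the partition number.
                    part = substr(dev, length(ref) + 1)
                    if (part ~ /^[0-9]+$/)
                        printf "mdadm --manage /dev/%s -a /dev/%s%s\n", $1, new, part
                }
            }
        }
    ' "${3:-/proc/mdstat}"
}

# Usage: suggest_adds sdc sdd
```

Once the partitions are added, cat /proc/mdstat shows a recovery line with a percentage for each array that is rebuilding.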

