Once the disk has been marked as failed (or is actually defective), mdadm will automatically remove it from the RAID. After that, you can add the replacement disk back either as a regular data disk or as a hot-spare (a hot-spare in my case). Here's the state after the rebuild for the failed disk started:
root:(charon.ka.heimdaheim.de) PWD:~
Sun Jul 27, 23:40:35 [0] > cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdi1[0] sdh1[6] sdj1[7] sdk1[8] sdf1[9] sdb1[5] sde1[4] sdd1[2] sdc1[1]
15626121216 blocks super 1.2 level 5, 512k chunk, algorithm 2 [9/8] [UUUUU_UUU]
[===================>.] recovery = 97.9% (1913176308/1953265152) finish=12.3min speed=53936K/sec
unused devices: <none>
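If you want to script around this (for example, to log rebuild progress), the recovery line in /proc/mdstat is easy to parse. A minimal sketch; the regular expression and the function name are my own, not part of mdadm:

```python
import re

# Matches the recovery line printed by the md driver, e.g.:
#   [===>.] recovery = 97.9% (1913176308/1953265152) finish=12.3min speed=53936K/sec
RECOVERY_RE = re.compile(
    r"recovery\s*=\s*(?P<pct>[\d.]+)%"
    r"\s*\((?P<done>\d+)/(?P<total>\d+)\)"
    r"\s*finish=(?P<finish>[\d.]+)min"
    r"\s*speed=(?P<speed>\d+)K/sec"
)

def parse_recovery(mdstat_text):
    """Return a dict describing the running rebuild, or None if there is none."""
    m = RECOVERY_RE.search(mdstat_text)
    if m is None:
        return None
    return {
        "percent": float(m.group("pct")),
        "blocks_done": int(m.group("done")),
        "blocks_total": int(m.group("total")),
        "finish_min": float(m.group("finish")),
        "speed_kps": int(m.group("speed")),
    }

sample = "[===================>.]  recovery = 97.9% (1913176308/1953265152) finish=12.3min speed=53936K/sec"
print(parse_recovery(sample))
```

In a real monitoring script you would feed it the contents of /proc/mdstat instead of the sample string.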
In order to add the replacement disk back to the RAID, you'll first have to prepare a partition on it (see this post for more details). After that, re-adding it as a hot-spare is a simple mdadm call:
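For GPT disks, one common way to prepare the new disk is to replicate the partition table from a healthy array member with sgdisk. This is only a sketch: /dev/sdX and /dev/sdY are placeholders you must replace, and the guard exists so the snippet does nothing on a machine without those devices:

```shell
# Placeholder device names -- replace with a healthy member and the new disk.
SRC=/dev/sdX   # existing array member whose partition layout we want to copy
DST=/dev/sdY   # freshly installed replacement disk

if [ -b "$SRC" ] && [ -b "$DST" ]; then
    sgdisk -R "$DST" "$SRC"   # replicate SRC's GPT onto DST
    sgdisk -G "$DST"          # randomize GUIDs so the two disks don't clash
else
    echo "skipping: $SRC or $DST is not a block device"
fi
```

Double-check the device names before running this: copying a partition table onto the wrong disk is destructive.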
root:(charon.ka.heimdaheim.de) PWD:~
Sun Jul 27, 23:44:19 [0] > mdadm --add /dev/md127 /dev/sdg1
mdadm: added /dev/sdg1
root:(charon.ka.heimdaheim.de) PWD:~
Sun Jul 27, 23:44:37 [0] > cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md127 : active raid5 sdg1[10](S) sdi1[0] sdh1[6] sdj1[7] sdk1[8] sdf1[9] sdb1[5] sde1[4] sdd1[2] sdc1[1]
15626121216 blocks super 1.2 level 5, 512k chunk, algorithm 2 [9/8] [UUUUU_UUU]
[===================>.] recovery = 98.6% (1926141684/1953265152) finish=8.6min speed=52224K/sec
unused devices: <none>
As you can see, the new disk (sdg1, device number 10 in this example) has been added as a hot-spare, marked with (S) in /proc/mdstat. mdadm --detail shows that a bit more clearly:
root:(charon.ka.heimdaheim.de) PWD:~
Sun Jul 27, 23:44:46 [0] > mdadm --detail /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Sat Jan 26 18:35:19 2013
Raid Level : raid5
Array Size : 15626121216 (14902.23 GiB 16001.15 GB)
Used Dev Size : 1953265152 (1862.78 GiB 2000.14 GB)
Raid Devices : 9
Total Devices : 10
Persistence : Superblock is persistent
Update Time : Sun Jul 27 23:44:37 2014
State : clean, degraded, recovering
Active Devices : 8
Working Devices : 10
Failed Devices : 0
Spare Devices : 2
Layout : left-symmetric
Chunk Size : 512K
Rebuild Status : 98% complete
Name : charon:aggr1 (local to host charon)
UUID : 6d11820f:04847070:2725c434:9ee39718
Events : 11221
Number Major Minor RaidDevice State
0 8 129 0 active sync /dev/sdi1
1 8 33 1 active sync /dev/sdc1
2 8 49 2 active sync /dev/sdd1
4 8 65 3 active sync /dev/sde1
5 8 17 4 active sync /dev/sdb1
6 8 113 5 spare rebuilding /dev/sdh1
9 8 81 6 active sync /dev/sdf1
8 8 161 7 active sync /dev/sdk1
7 8 145 8 active sync /dev/sdj1
10 8 97 - spare /dev/sdg1
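If you need this information in a script (say, to alert when a device is stuck in a spare or faulty state), the device table at the bottom of mdadm --detail can be parsed as well. A minimal sketch based on the output format above; the regex and function name are my own:

```python
import re

# One row of the device table at the end of `mdadm --detail`, e.g.:
#   "   6       8      113        5      spare rebuilding   /dev/sdh1"
ROW_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+|-)\s+(.+?)\s+(/dev/\S+)\s*$")

def device_states(detail_text):
    """Map each member device to its state string (e.g. 'active sync', 'spare')."""
    states = {}
    for line in detail_text.splitlines():
        m = ROW_RE.match(line)
        if m:
            states[m.group(6)] = m.group(5).strip()
    return states

sample = """\
    0       8      129        0      active sync   /dev/sdi1
    6       8      113        5      spare rebuilding   /dev/sdh1
   10       8       97        -      spare   /dev/sdg1
"""
print(device_states(sample))
```

In practice you would run something like `mdadm --detail /dev/md127` via subprocess and feed its stdout to the function.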