Yup. She’s down. No hardware faults that I can see, but the array claims that all the drives contain data from another system. (Settings -> Disks)
I didn’t have any active I/O to the array at the time of the fault, so I’m hoping that the RAID structures are consistent enough for repair. The storage recovery page is accessed by going to the hidden support page: https://<iomega_ip>/support.html and clicking Recover Disks.
On the Storage Recovery Verification page, click the check box and Apply.
The array displays a confirmation screen, but no further status that I can see.
I waited a LONG time and couldn’t find any further indication of progress. The array appears idle. Anyway, I’m not willing to give up there. Surely the RAID structures aren’t that fragile…. I remember reading somewhere that the Iomega NAS units are Linux-based, so let’s see if I can SSH into this thing!
Disclaimer: If you follow steps after this line you may render your device unusable. Buyer beware; your mileage may vary; at your own risk; etc.
I came across this excellent article by Christopher Kusek (PKGuild): Shell access to your ix2/ix4 exposed! “Get yer red hot ssh here!” If you want to hack it yourself, read the post. I followed all of the steps, including the John the Ripper brute-force attack on the shadow file. Then I saw at the end of the post that the password is always soho+admin_password…. That’s the second time in my life that I learned to read the whole assignment before starting. 🙂
So, back on the support screen, click Support Access.
Check the box to Allow remote access for support and click Apply.
Now the fun can begin!!! SSH into the NAS as root. Remember, the password is soho+admin_password. So, if your admin password is “haxor”, the root password will be “sohohaxor”.
Jasons-MacBook-Pro:~ thornj$ ssh root@192.168.0.254
root@192.168.0.254's password:
root@storage:/#
It looks like the Iomega uses the standard Linux software RAID implementation (md), so the array should be recoverable with standard Linux tools. You can find more detailed information about this on raid.wiki.kernel.org.
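Broadly, the recovery playbook with those tools looks like this (just a sketch of the commands I expect to lean on, roughly in order of increasing risk; the actual session follows):

# inspect the current array state
cat /proc/mdstat
mdadm --detail /dev/md1
# save and compare the on-disk superblocks before touching anything
mdadm --examine /dev/sd[a-d]2
# progressively more invasive repair attempts
mdadm /dev/md1 --re-add <failed partitions>
mdadm --assemble --force /dev/md1 <member partitions>
mdadm --create --assume-clean ...   # last resort: recreate the array metadata in place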
So, first to check the array health.
root@storage:/# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid5 sdd2[4](S) sdc2[5](S) sda2[0] sdb2[1]
5854422528 blocks level 5, 64k chunk, algorithm 2 [4/2] [UU__]
md0 : active raid1 sdd1[3] sdc1[2] sda1[0] sdb1[1]
2040128 blocks [4/4] [UUUU]
unused devices: <none>
root@storage:/#
The device md1 is my RAID 5 data array. The device md0 is the RAID 1 array that contains the ix2-200d’s kernel. Right away it’s not looking good. Note the “UU__”: there are two up devices and two down devices on md1. All four devices are up on md0 (“UUUU”).
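For anyone who doesn’t read /proc/mdstat every day, here’s how I’m decoding those status lines (standard md notation):

# md1 : active raid5 sdd2[4](S) sdc2[5](S) sda2[0] sdb2[1]
#   [n] after a partition -> that member's device number within the array
#   (S)                   -> the member is being treated as a spare, not an active data disk
#   (F)                   -> the member has been marked faulty (shows up later in this saga)
#   [4/2]                 -> the array wants 4 devices but only 2 are currently active
#   [UU__]                -> members 0 and 1 are up, members 2 and 3 are down/missing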
Next to look at the status of md1.
root@storage:/# mdadm --detail /dev/md1
/dev/md1:
Version : 00.90
Creation Time : Sat Nov 8 18:15:18 2014
Raid Level : raid5
Array Size : 5854422528 (5583.21 GiB 5994.93 GB)
Used Dev Size : 1951474176 (1861.07 GiB 1998.31 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Thu Nov 27 20:33:21 2014
State : clean, degraded
Active Devices : 2
Working Devices : 4
Failed Devices : 0
Spare Devices : 2
Layout : left-symmetric
Chunk Size : 64K
UUID : bb9047c5:f742e3ce:2e29483d:f114274d (local to host storage)
Events : 0.22892
Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
2 0 0 2 removed
3 0 0 3 removed
4 8 50 - spare /dev/sdd2
5 8 34 - spare /dev/sdc2
root@storage:/#
OUCH! It appears that drives 2 and 3 faulted. Before I make any changes, I’ll save the superblock info. I can come back to this later for reference if I make mistakes reassembling the array.
root@storage:/# mdadm --examine /dev/sd[abcd]2 > raid.status
root@storage:/#
I’ll check whether the event counters on the four members match, to gauge how far out of sync they are and how much data loss I’m likely to encounter.
root@storage:/# mdadm --examine /dev/sd[a-d]2 | egrep 'Event|/dev/sd'
/dev/sda2:
Events : 22894
this 0 8 2 0 active sync /dev/sda2
0 0 8 2 0 active sync /dev/sda2
1 1 8 18 1 active sync /dev/sdb2
4 4 8 50 4 spare /dev/sdd2
5 5 8 34 5 spare /dev/sdc2
/dev/sdb2:
Events : 22894
this 1 8 18 1 active sync /dev/sdb2
0 0 8 2 0 active sync /dev/sda2
1 1 8 18 1 active sync /dev/sdb2
4 4 8 50 4 spare /dev/sdd2
5 5 8 34 5 spare /dev/sdc2
/dev/sdc2:
Events : 22894
this 5 8 34 5 spare /dev/sdc2
0 0 8 2 0 active sync /dev/sda2
1 1 8 18 1 active sync /dev/sdb2
4 4 8 50 4 spare /dev/sdd2
5 5 8 34 5 spare /dev/sdc2
/dev/sdd2:
Events : 22894
this 4 8 50 4 spare /dev/sdd2
0 0 8 2 0 active sync /dev/sda2
1 1 8 18 1 active sync /dev/sdb2
4 4 8 50 4 spare /dev/sdd2
5 5 8 34 5 spare /dev/sdc2
root@storage:/#
The event counters are identical, so the array should be safe to reconstruct without any data loss.
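(A quicker way to eyeball that, for what it’s worth: a convenience one-liner rather than something from the session above.)

# collapses to a single line if all four superblocks report the same event count
mdadm --examine /dev/sd[a-d]2 | grep Events | sort -u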
First, I’ll try to re-add the drives.
root@storage:/# mdadm /dev/md1 --re-add /dev/sd[cd]2
mdadm: Cannot open /dev/sdc2: Device or resource busy
root@storage:/#
Next, I’ll try to forcibly reassemble the array.
root@storage:/# mdadm --assemble --force /dev/md1 /dev/sd[a-d]2
mdadm: device /dev/md1 already active - cannot assemble it
root@storage:/#
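In hindsight, that forced assemble probably failed only because md1 was still running. A less drastic option (untested here) would have been to stop the array and retry the assemble before reaching for --create:

# stop the running (degraded) array, then attempt a forced reassembly from the superblocks
mdadm --stop /dev/md1
mdadm --assemble --force /dev/md1 /dev/sd[a-d]2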
Last, I’ll try to recreate the array. Before I do that, I need to find the used device size so the recreated array matches the original exactly.
root@storage:/# grep Used raid.status
Used Dev Size : 1951474176 (1861.07 GiB 1998.31 GB)
Used Dev Size : 1951474176 (1861.07 GiB 1998.31 GB)
Used Dev Size : 1951474176 (1861.07 GiB 1998.31 GB)
Used Dev Size : 1951474176 (1861.07 GiB 1998.31 GB)
root@storage:/#
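Since that value has to match the original exactly, it’s worth pulling it straight out of the saved superblock dump rather than retyping it (a small convenience sketch; in the next step I just type the number in by hand):

# grab the device size (in KiB) from the first 'Used Dev Size' line in raid.status
SIZE=$(awk '/Used Dev Size/ {print $5; exit}' raid.status)
echo $SIZE    # 1951474176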
Now to create the array:
root@storage:/# mdadm --stop /dev/md1
mdadm: stopped /dev/md1
root@storage:/# mdadm --remove /dev/md1
root@storage:/# mdadm --create --assume-clean --level=5 --raid-devices=4 --size=1951474176 /dev/md1 /dev/sd[a-d]2
mdadm: /dev/sda2 appears to be part of a raid array:
level=raid5 devices=4 ctime=Sat Nov 8 18:15:18 2014
mdadm: /dev/sdb2 appears to be part of a raid array:
level=raid5 devices=4 ctime=Sat Nov 8 18:15:18 2014
mdadm: /dev/sdc2 appears to be part of a raid array:
level=raid5 devices=4 ctime=Sat Nov 8 18:15:18 2014
mdadm: /dev/sdd2 appears to be part of a raid array:
level=raid5 devices=4 ctime=Sat Nov 8 18:15:18 2014
Continue creating array? y
mdadm: array /dev/md1 started.
root@storage:/#
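One note on that --create: I didn’t pass --chunk, --layout, or --metadata, so it relied on this mdadm build’s defaults matching the original geometry. Judging by the detail output further down (64K chunk, left-symmetric, 0.90 superblock), they did. On a newer mdadm, where the defaults differ, it would be safer to pin them explicitly, something like:

# recreate with the original geometry spelled out rather than trusting defaults
mdadm --create --assume-clean --metadata=0.90 --level=5 --raid-devices=4 \
      --chunk=64 --layout=left-symmetric --size=1951474176 /dev/md1 /dev/sd[a-d]2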
Now to check if the array is created.
root@storage:/# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid5 sda2[0] sdd2[3] sdc2[2] sdb2[1]
5854422528 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
2040128 blocks [4/4] [UUUU]
unused devices: <none>
root@storage:/#
It looks good, but the UI doesn’t show any change. Reboot so that the NAS initializes properly.
root@storage:/# reboot
Success! After the reboot, all drives and RAID volumes look fine in the UI, and it reports that the reconstruction is running again. With a little luck it will actually complete.
The UI remained stuck at 0% reconstructed, though, and the CLI never showed a rebuild in progress. Then I remembered that the original rebuild of /dev/sdd2 may never have completed, so I decided to force a fault on /dev/sdd2 to trigger a fresh reconstruct.
root@storage:/# mdadm --manage --set-faulty /dev/md1 /dev/sdd2
mdadm: set /dev/sdd2 faulty in /dev/md1
root@storage:/#
Check the array status by looking at /proc/mdstat. The device is showing as faulted, but reconstruction has not started.
root@storage:/# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid5 sda2[0] sdd2[4](F) sdc2[2] sdb2[1]
5854422528 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
2040128 blocks [4/4] [UUUU]
unused devices: <none>
root@storage:/#
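For reference, with plain mdadm a faulted member normally has to be removed and re-added before md will start rebuilding onto it; the generic sequence looks like this (shown for context, not a transcript from this session):

# kick off a rebuild on a member previously marked faulty
mdadm /dev/md1 --remove /dev/sdd2
mdadm /dev/md1 --add /dev/sdd2

On this box the rebuild kicked off on its own shortly afterwards, as the next check shows.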
Check the array status again with mdadm. The failed device is now listed as a rebuilding spare, and the reconstruction has started automatically.
root@storage:/# mdadm -D /dev/md1
/dev/md1:
Version : 00.90
Creation Time : Thu Nov 27 21:33:13 2014
Raid Level : raid5
Array Size : 5854422528 (5583.21 GiB 5994.93 GB)
Used Dev Size : 1951474176 (1861.07 GiB 1998.31 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Fri Nov 28 08:35:28 2014
State : clean, degraded, recovering
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 64K
Rebuild Status : 0% complete
UUID : c0c74b41:ea6d9b00:2e29483d:f114274d (local to host storage)
Events : 0.20
Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
2 8 34 2 active sync /dev/sdc2
4 8 50 3 spare rebuilding /dev/sdd2
root@storage:/#
Continue to monitor status until reconstruction reaches 1%.
root@storage:/# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid5 sdd2[4] sda2[0] sdc2[2] sdb2[1]
5854422528 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
[>....................] recovery = 1.0% (19524352/1951474176) finish=962.5min speed=33452K/sec
md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
2040128 blocks [4/4] [UUUU]
unused devices: <none>
root@storage:/#
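If the box has a watch binary (not a given on a stripped-down busybox build), it saves re-running the check by hand:

# refresh the rebuild status once a minute
watch -n 60 cat /proc/mdstat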
At 1% progress, check the UI to see if it’s tracking.
Reconstruct is running! In 16 hours it should be complete.