Data recovery and disk repair questions and discussions related to old-fashioned SATA, SAS, SCSI, IDE, MFM hard drives - any type of storage device that has moving parts

Mysterious RAID 5 Repeated Drive Failure

August 9th, 2015, 17:46

Hello,

Here is the situation about which I am now panicking...

System:
Asus P8P67 mobo.
Intel P67 chipset RAID controller.
RAID 5, originally with 5 Samsung HD103SJ 1TB hdd's. One of the Samsung drives is now a Hitachi HDE72101.
Windows 7.

Ok, here goes:

Found the computer at the BIOS password prompt, indicating that it had restarted on its own. Typed the password, and found that the computer would not boot because of a failed hdd. Rebooted and used ctrl+I to enter Intel Rapid Storage Technology - Option ROM - 10.1.0.1008 (IRST). One hdd of the five had failed.

Ordered a new drive, same make & model. Replaced. Entered IRST and selected rebuild. Booted to windows. Opened Intel RAID application (forget what it's called in Windows), and sure enough, the rebuild was in progress.

The RAID volumes rebuilt, and all was well for a few days when:

The exact same thing happened. Same drive, same everything.

So I removed the drive and returned it to Amazon because I thought it was defective, and ordered a replacement.

When I put the new drive in (still a Samsung HD103SJ), it did the exact same thing: rebuilt in OS, then failed a few days later.

So now I'm thinking, "What, are all these drives defective?"

Next, I ordered a Hitachi HDE72101 because I read that they're more reliable, and they're cheaper than the older Samsungs anyway. Still 1TB.

It, too, rebuilt and worked for a few days, then failed.

I shut the computer down, and replaced the SATA cable. No luck. I switched the failed drive to a different power supply leg. No luck. I switched it to a different SATA port. No luck.

So, I decided that it was too unlikely that all these brand new drives were failing, and so it must be the power supply or motherboard, or something. Then, I did nothing for a few weeks.

Well, now this:

Found the computer at the BIOS password prompt. Uh oh. Ctrl+I for IRST utility. BAM! Two failed drives. The new Hitachi (which had been "failed" for a few weeks now), and another Samsung.

Do you want to rebuild? YES!

So that's where I'm at now. When I turn the computer on and press ctrl+I for the IRST utility, it says both RAID volumes (0&1) are at rebuild status. However, it also says that all five drives are working fine. The status of each of the five hdd's is member disk (0,1).

As usual, at the bottom of the screen it says:

Volumes with rebuild status will be rebuilt within the operating system.

However, when I exit the utility, and try to boot up, the following happens.

I get a weird status bar across the bottom of the screen that says windows is loading files. This happens twice.

Then, it says loading Windows for a second or two, accompanied by a smaller status bar that goes back and forth.

Then an old looking Windows dialogue box with a wispy blue background and some green shoots with leaves and a white bird on the lower right comes up. The dialogue is titled System Recovery Options, and starts with select a language (which is greyed out), and select a keyboard input method (US is the only option).

When I click next, I am given two options.

First is:

Use recovery tools that can help fix problems starting Windows. Select an operating system to repair.

If your operating system isn't listed, click load drivers and install drivers for your hard disks.

Then, there's a box with nothing in it.

The second option is:

Restore your computer using a system image that you created earlier (this is the default option).

Neither of these options seems to get me anywhere.

If you can, please help me. My whole life is on this computer. All kinds of important documents, every photo I've taken over the past 20 years. I'm beside myself with stress, fear, and sadness.

Thank you in advance for any suggestions.

Peace,

Jim Morgan

Re: Mysterious RAID 5 Repeated Drive Failure

August 9th, 2015, 20:28

I would stop what you are doing and refrain from blindly replacing your drives. Instead take some time to determine how the drives are failing.

Do the failed drives spin up? If not, test the TVS diodes.

See http://www.users.on.net/~fzabkar/HDD/TVS_diode_FAQ.html

... and http://www.users.on.net/~fzabkar/HDD/ (photo clips)
http://www.users.on.net/~fzabkar/HDD/HD103SJ_TVS_2.jpg

Note that some versions of Intel's P67 chipset had a SATA bug. This bug affects the 3Gbps ports but not the 6Gbps ports.

http://www.anandtech.com/show/4143/the- ... t-sata-bug
http://www.anandtech.com/show/4142/inte ... ins-recall

If you have a desktop system with six SATA ports driven off of P67/H67 chipset, there’s a chance (at least 5%) that during normal use some of the 3Gbps ports will stop working over the course of 3 years. The longer you use the ports, the higher that percentage will be.

https://ata.wiki.kernel.org/index.php/K ... outhbridge

SATA link(s) performance may degrade over time on some B2 stepping Cougar Point PCH southbridge. Links will develop increased bit error rates and failed transfers have to be retried upon error detection by the SATA controller. As the wear out continues performance will get worse as the SATA controller will spend more time retrying failed transfers than it will spend on sending actual data. At some point things will get so bad that attached devices will be disconnected because of unreliable link from unstable clock and will not be detected at all.

Re: Mysterious RAID 5 Repeated Drive Failure

August 9th, 2015, 21:16

If you start testing the drives, it is a good idea to label each one so that they cannot get mixed up. Test them one by one.
For the future: never ever rebuild a RAID without a backup, even if manufacturers or anyone else tell you it is OK to do so.

Re: Mysterious RAID5 Repeated Drive Failure

August 10th, 2015, 10:14

earlmyrtle wrote: My whole life is on this computer. All kinds of important documents, every photo I've taken over the past 20 years.

Considering this point, you may indeed have done too much without imaging the drives first, especially by starting a RAID 5 rebuild with two failed drives.
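To illustrate why two failed members is the critical point for RAID 5, here is a minimal Python sketch of XOR parity on a single stripe. It is a toy model only: the block contents are made up, and a real Intel RST array also rotates parity across the drives and uses a much larger stripe size.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a 5-drive RAID 5: four data blocks plus one parity block.
# The block contents are invented for illustration.
data = [b"docs", b"pics", b"mail", b"misc"]
parity = xor_blocks(data)

# One member lost (data[1]): XOR of everything that survives restores it.
rebuilt = xor_blocks([data[0], data[2], data[3], parity])
assert rebuilt == data[1]

# Two members lost (data[0] and data[1]): the survivors only yield the XOR
# of the two missing blocks, which is not enough to separate them again.
two_lost = xor_blocks([data[2], data[3], parity])
assert two_lost == xor_blocks([data[0], data[1]])
```

With one member missing the parity is enough to recompute it. With two missing, a rebuild has nothing correct to compute from, so anything the controller writes during a forced rebuild can overwrite data that was still recoverable.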

BTW, before returning the failed drives, did you try to test or clone them by connecting each one directly to a PC? Those images could be of use now.


If you want to go the DIY way, image each drive, then either work with the images in read-only mode or continue with the source drives, but make sure that you can restore the images properly.
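One way to check that an image restores properly is to compare checksums of the image and of whatever was restored from it. The following is a rough Python sketch of that idea, not a complete verification procedure. The function names are my own, and the file paths in the usage note are placeholders.

```python
import hashlib

def image_sha256(path, chunk_size=1024 * 1024):
    """Hash an image file in chunks, opening it read-only ("rb")."""
    digest = hashlib.sha256()
    with open(path, "rb") as image:
        # Read 1 MiB at a time so a 1TB image never has to fit in memory.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """True when both files hash to the same SHA-256 value."""
    return image_sha256(path_a) == image_sha256(path_b)
```

For example, `same_content("disk0.img", "disk0_restored.img")` (placeholder file names) should return True before you start experimenting on either copy.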

Should you find yourself stuck at any step, I can connect and look into this recovery remotely.

Re: Mysterious RAID 5 Repeated Drive Failure

August 10th, 2015, 10:23

I think that you have done enough on your own to realize that you are in over your head, and if your data is as important as you say, it is best to seek professional assistance before further damage is caused. I, too, could help you remotely, but with the questionable condition of the drives, it could be risky to work on the original drives without a backup clone. Remote recovery also requires direct access to the drives (originals or clones), bypassing the RAID controller.

Feel free to give me a call, if you like.

Re: Mysterious RAID 5 Repeated Drive Failure

August 10th, 2015, 11:31

My suspicion in your case is actually the RAID controller. I would guess that it has a faulty port and is randomly disconnecting that one drive, possibly even when nothing is wrong with the drive.

However, in any event, you now have two drives marked as offline, so anything you try to do yourself is likely to result in data loss. You'll probably be looking at about $1500-$2000 for the recovery if you send it out. If you have a way to connect all the drives individually to another system and allow remote access via TeamViewer, you can probably cut that down, though it's a good idea to clone all the drives first using ddrescue.
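As a rough sketch of what cloning with ddrescue could look like, the Python helper below only builds the command lines for each labeled drive and prints them for review; it does not run anything. It follows the two-pass pattern from the GNU ddrescue manual: a first pass with `-n` to skip the slow scraping phase, then a second pass with `-d -r3` for direct access and three retry passes. The drive labels, the `/dev/sdX` names and the `/mnt/rescue` directory are placeholders, so check the real device names on your own system before running any of the output.

```python
import shlex

def ddrescue_commands(drives, image_dir="/mnt/rescue"):
    """Return two ddrescue command lines per drive: a fast pass, then retries."""
    commands = []
    for label, device in drives.items():
        image = f"{image_dir}/{label}.img"
        mapfile = f"{image_dir}/{label}.map"  # lets ddrescue resume where it stopped
        # Pass 1: grab everything that reads easily, skip scraping bad areas (-n).
        commands.append(["ddrescue", "-n", device, image, mapfile])
        # Pass 2: revisit the bad areas with direct access (-d), 3 retry passes (-r3).
        commands.append(["ddrescue", "-d", "-r3", device, image, mapfile])
    return commands

# Hypothetical labels and device names; yours will differ.
drives = {
    "port0_samsung": "/dev/sdb",
    "port1_samsung": "/dev/sdc",
    "port4_hitachi": "/dev/sdf",
}

for command in ddrescue_commands(drives):
    print(shlex.join(command))
```

Keeping one image and one mapfile per labeled drive also keeps the member order clear for whoever reassembles the array later.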