All times are UTC - 5 hours [ DST ]




 Post subject: data recovery for RAID
PostPosted: July 28th, 2012, 8:00 

Joined: July 28th, 2012, 7:57
Posts: 6
Location: United States
In the middle of a rebuild, one of the drives in my RAID 6 array decided to break down and take the entire array with it. The drive is reporting multiple SMART errors. Please see the pictures. Any ideas on how to remedy it? The drive is a Western Digital RE4-GP 2TB WD2002FYPS with firmware 04.05G05.


Attachments: results.rtf, results.png, HDTune_Health_WDC_WD2002FYPS-01U1B0.png
 Post subject: Re: data recovery for RAID
PostPosted: July 28th, 2012, 10:55 

Joined: August 18th, 2010, 17:35
Posts: 3669
Location: Massachusetts, USA
Replace with a brand new one. That drive is no longer reliable, especially in a RAID config.

Strongly recommend that the other drives be checked as sometimes they fail in bunches.

_________________
Hard Disk Drive (HDD), Solid State Drive (SSD, SATA, NVMe, etc), USB Flash Drive and RAID Data Recovery Specialist in Massachusetts


 Post subject: Re: data recovery for RAID
PostPosted: July 28th, 2012, 15:21 

Joined: July 28th, 2012, 7:57
Posts: 6
Location: United States
labtech wrote:
Replace with a brand new one. That drive is no longer reliable, especially in a RAID config.

Strongly recommend that the other drives be checked as sometimes they fail in bunches.

Already ordered a replacement... but as I said, the drive has already taken down the array. I can't afford to just throw it away.


 Post subject: Re: data recovery for RAID
PostPosted: July 28th, 2012, 16:17 

Joined: May 6th, 2008, 22:53
Posts: 2138
Location: England
I'm not promising to fix this for you remotely, but FYI your description so far doesn't really make sense to me, due to missing details. You say this is RAID 6 and it was "rebuilding". Why? One drive already faulty? Or just some bad blocks on one drive but your RAID controller "failed" it anyway? Or something else?

IMHO more detail is needed of the history of your current situation, including full configuration details with type of RAID controller / software RAID etc. as well as descriptions of all the tests and changes you have already done, the results of each of those steps, and any relevant error messages or logs from your RAID controller / OS (if software RAID).

If it really is a RAID 6 volume, even if a second drive has a problem (which I guess you're saying is that drive you are now describing with that SMART data) and it is "failed" (or "evicted") by the RAID controller / OS, you would not lose access to the data in that RAID volume since RAID 6 can withstand two drive failures, and still allow data access.

Therefore perhaps it wasn't really operating as RAID 6 (which can happen in at least a couple of different ways), or you have missed some vital info in your description to explain why two drive problems have resulted in loss of access to data. With full & detailed info from you, IMHO there's a better chance of you getting appropriate advice - otherwise readers' guesses at those missing details might mean that you get wrong advice (or none at all).

As always, any DIY recovery attempts carry risks, especially for the inexperienced... You can potentially change a recoverable situation into an unrecoverable one. Consider this, if the data is valuable.


 Post subject: Re: data recovery for RAID
PostPosted: July 29th, 2012, 12:31 

Joined: August 18th, 2010, 17:35
Posts: 3669
Location: Massachusetts, USA
Sorry, I misunderstood based on assumptions due to lack of details.

With RAID 6, up to 2 drives can fail and the array will still be online. When a third fails, the array goes offline.

So, I figured if you said one drive, then this "one drive" was the first failed drive, thus initiating the rebuilding process.

So, as Vulcan mentioned, a whole description is necessary along with a clear indication of what your goal is.
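
The RAID 6 tolerance rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `raid6_state` and the state labels are made up here, and a real controller (3ware or otherwise) tracks far more than a simple count of missing drives.

```python
def raid6_state(total_drives: int, unavailable: int) -> str:
    """Classify a RAID 6 array by how many member drives are unavailable.

    Hypothetical helper for illustration only; the labels are not taken
    from any controller's documentation.
    """
    if total_drives < 4:
        raise ValueError("RAID 6 needs at least 4 member drives")
    if not 0 <= unavailable <= total_drives:
        raise ValueError("unavailable must be between 0 and total_drives")
    if unavailable == 0:
        return "optimal"
    if unavailable <= 2:
        # Dual parity can still reconstruct every stripe.
        return "degraded"
    # With a third member missing, stripes can no longer be reconstructed.
    return "offline"


# The 14-drive array from this thread with 0 to 3 members missing.
for missing in range(4):
    print(missing, raid6_state(14, missing))
```

Seen this way, two dropped drives plus one drive with read errors is already past what dual parity can absorb, at least for the stripes that touch the unreadable areas.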

_________________
Hard Disk Drive (HDD), Solid State Drive (SSD, SATA, NVMe, etc), USB Flash Drive and RAID Data Recovery Specialist in Massachusetts


 Post subject: Re: data recovery for RAID
PostPosted: July 31st, 2012, 5:45 

Joined: July 28th, 2012, 7:57
Posts: 6
Location: United States
Sorry guys, I should have been more clear in my original post. I did not have all the information back then. Actually, I still don't have all the information, but I have a much better picture.

Here's my setup:

OS: Windows Server 2008 R2

HDD: Western Digital 2002FYPS

Raid Controller: 3ware 9650SE 24ML

Array: RAID 6 (14 HDDs)

Here are the error logs from my RAID controller.
https://dl.dropbox.com/u/10737837/lsi.Win2k8R2.HOMESERVER.072412.10704.zip
https://dl.dropbox.com/u/10737837/errorlog_0.dat


A few nights ago, the array reported that there was a problem with two of the drives (drives 0 and 5). I'm not sure what the exact error was. I was in the middle of a relatively large transfer (~50GB). All of a sudden, my system froze for about 2 hours, and after that, I managed to do a normal restart. The controller does this sometimes - kicks out two drives randomly and then proceeds to rebuild them.

When the system started, I checked 3DM2 (the raid controller software). It said that the array was degraded and proceeded to automatically rebuild the array. Everything was fine until the rebuild process hit 14%. Then I received several errors concerning drive 3 (which is strange, because the two drives that were dropped are 0 and 5). While the rebuilding process is still listed as "active" under 3DM2, it hasn't progressed at all overnight.

I took drive 3 out of the array and ran the WD diagnostics software on it:
Image



I also tried reading the SMART data via HD Tune:
Image


 Post subject: Re: data recovery for RAID
PostPosted: August 1st, 2012, 4:01 

Joined: January 28th, 2009, 10:54
Posts: 3547
Location: Greece
So you have 3 failed drives (rare for FYPS i might add, these are pretty good drives).

Since your 3rd drive still detects I say you make a clone of it using some non-windows utility like ddrescue and then go from there.

_________________
http://www.northwind.gr
SandForce SSD Recovery
Ransomware Reverse Engineering - NoMoreRansom! partners


 Post subject: Re: data recovery for RAID
PostPosted: August 1st, 2012, 4:10 

Joined: July 28th, 2012, 7:57
Posts: 6
Location: United States
northwind wrote:
So you have 3 failed drives (rare for FYPS i might add, these are pretty good drives).

Since your 3rd drive still detects I say you make a clone of it using some non-windows utility like ddrescue and then go from there.



Well, actually drives 0 and 5 have not really "failed". The array just refuses to accept them for some reason.

It's drive 3 that's failed.


 Post subject: Re: data recovery for RAID
PostPosted: August 1st, 2012, 5:31 

Joined: July 28th, 2012, 7:57
Posts: 6
Location: United States
northwind wrote:
So you have 3 failed drives (rare for FYPS i might add, these are pretty good drives).


Actually, I only recently found out that the 2002FYPS, a so-called "green" drive with variable RPM, is dangerous when used in RAID configurations despite its "enterprise" label. That's why WD came out with the 2003FYPS, which has a constant RPM.



Variable RPM -> timing issues -> drives being dropped from the array.

I think that's what happened with drives 0 and 5. That's why I think there's no actual physical damage to them.



Quote:
Since your 3rd drive still detects I say you make a clone of it using some non-windows utility like ddrescue and then go from there.


why ddrescue? I'm not familiar with this particular piece of software. do you have any guides you could link to? Thanks a lot!


 Post subject: Re: data recovery for RAID
PostPosted: August 1st, 2012, 7:51 

Joined: January 28th, 2009, 10:54
Posts: 3547
Location: Greece
hdd0 wrote:
northwind wrote:
So you have 3 failed drives (rare for FYPS i might add, these are pretty good drives).


Actually, I only recently found out that the 2002FYPS, a so-called "green" drive with variable RPM, is dangerous when used in RAID configurations despite its "enterprise" label. That's why WD came out with the 2003FYPS, which has a constant RPM.


Hmmm... maybe you're right on this one.

hdd0 wrote:
Quote:
Since your 3rd drive still detects I say you make a clone of it using some non-windows utility like ddrescue and then go from there.


why ddrescue? I'm not familiar with this particular piece of software. do you have any guides you could link to? Thanks a lot!


ddrescue has a lot of good features that will help during imaging. Use this forum's search function to look for ddrescue and you will find a lot of good information.
Search posts from member Vulcan who has posted a lot of useful information about this. Also check the manual: http://www.gnu.org/software/ddrescue/ma ... anual.html

Good luck.

_________________
http://www.northwind.gr
SandForce SSD Recovery
Ransomware Reverse Engineering - NoMoreRansom! partners


 Post subject: Re: data recovery for RAID
PostPosted: August 1st, 2012, 9:28 

Joined: August 18th, 2010, 17:35
Posts: 3669
Location: Massachusetts, USA
hdd0 wrote:
why ddrescue? I'm not familiar with this particular piece of software. do you have any guides you could link to? Thanks a lot!


ddrescue is free and pretty good for what it was designed to do, as long as one understands how to use it.

There are other options, but most of them are not free.

_________________
Hard Disk Drive (HDD), Solid State Drive (SSD, SATA, NVMe, etc), USB Flash Drive and RAID Data Recovery Specialist in Massachusetts


 Post subject: Re: data recovery for RAID
PostPosted: August 1st, 2012, 9:34 

Joined: May 6th, 2008, 22:53
Posts: 2138
Location: England
@hdd0,

I don't want to interrupt your conversation with northwind - I'll just add some comments, all IMHO...

Not all details of the RAID configuration are clear to me - e.g. is there a spare drive (or more than one) included in your count of 14? I couldn't see that info in the logs (on a quick review). I'm also specifically not giving advice on what you should do next - remote RAID recovery is not always easy, even on RAID arrays which I do know. On those which I don't know well (like 3Ware), the risks are too high that I'll miss something important, and I don't have the (many hours of) time needed to read & understand every part of your logs. However some general principles apply, hence these comments.

Unfortunately I've seen there are several issues with this configuration and sys admin, which led to getting into this situation. :( A deep analysis could be done on the log files (I've just skimmed some of them) to try to get a better understanding, but I found indications that the drive configuration changed between mid-April (when the sick drive seemed to be connected to port 19) and early-May (sick drive is then connected to port 3), so there appears to be even more history to this!

If you decide to take the (IMHO significant) risks of DIY, then as already suggested to you, starting with cloning the failing "drive 3" (using a non-RAID controller, and preferably a non-Windows OS) is one option. There are likely to be several hundred (e.g. 477 or more) unreadable sectors on that disk, and that's assuming it doesn't fail more catastrophically during your attempts. If it doesn't fail during cloning, then at least you've frozen the situation with that drive before it gets even worse. IMHO you need to use a cloning tool which will record where the unreadable sectors originally were - you may need that info later (see below).
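
To illustrate why a record of the unreadable sectors is useful: GNU ddrescue keeps a mapfile (called a logfile in older versions) in which each data line gives a position, a size and a status character, with `-` marking areas that could not be read. A minimal Python sketch for listing those areas follows. The 512-byte sector size, the function name `bad_sector_ranges` and the sample mapfile contents are assumptions made up for this example; they are not data from the drive in this thread.

```python
from typing import List, Tuple

SECTOR = 512  # assumed logical sector size in bytes


def bad_sector_ranges(mapfile_text: str) -> List[Tuple[int, int]]:
    """Return (first_lba, sector_count) for every area marked '-' (unreadable)."""
    rows = [line.split() for line in mapfile_text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")]
    ranges = []
    # rows[0] is the "current position" status line, not a data block.
    for fields in rows[1:]:
        if len(fields) < 3:
            continue
        pos, size, status = int(fields[0], 16), int(fields[1], 16), fields[2]
        if status == "-":
            first = pos // SECTOR
            last = (pos + size - 1) // SECTOR
            ranges.append((first, last - first + 1))
    return ranges


# Made-up sample in the mapfile layout, for illustration only.
sample = """\
# Mapfile. Created by GNU ddrescue
# current_pos  current_status  current_pass
0x00120000     +               1
#      pos        size  status
0x00000000  0x00117000  +
0x00117000  0x00000200  -
0x00117200  0x00008E00  +
0x00120000  0x00000400  -
"""

for first_lba, count in bad_sector_ranges(sample):
    print(f"unreadable: LBA {first_lba}, {count} sector(s)")
```

Keeping the mapfile alongside the clone means the list of damaged areas survives even if the original drive later dies completely.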

To further reduce the risk of not being able to get back to this current state, cloning all the disks is generally a good option - yes, you'd need another 14 x 2TB disks (or equivalent filesystem space). You can start to see why RAID recovery is costly. :(

My personal choice of DIY cloning software, after considering & accepting the risks, is GNU ddrescue for several reasons, including its control over the cloning process & retries. However it relies on having some Linux/Unix sys admin knowledge, understanding of device naming, identification of specific devices (disks), the difference between a device node and a filesystem, concepts of "mounting" etc. etc. Without that knowledge, then ddrescue can be high risk, and we've seen some people make mistakes when using it & similar software. Other cloning software exists (other members have recommended DMDE, HDClone and more in the past). Ultimately, if you decide to do DIY, you would have to choose what suits your experience, skills, budget, expectations of support etc. etc. - we can't decide that for you, as there is no "one size fits all" solution.

I can't see in the logs when or why drives 0 & 5 (as you mentioned) were "failed" by the RAID controller for what may be a spurious reason (especially if they were "failed" at the same time). However it seems to have been like that for a while, since I can't see recent indications of that event. If I'm correct (that those 2 drives were "failed" a long time ago), then the data on those disks may be of limited value, due to being "stale" i.e. not kept updated with the rest of the RAID volume. It depends how much updating has been done of the files that you want to recover most.

If your important files have been updated since those 2 drives were "failed", that probably means your recovery options are limited to trying to use the remaining readable drives (11 of them?), plus the flaky "drive 3" or its clone, and ignoring the two drives which were "failed" some time ago (drives 0 & 5 - again, assuming that my understanding of their history is correct). However if your important files have not been changed since those 2 drives were "failed", then that may open the possibility of using the data on one or both of those disks, in combination with the others except drive 3, to recover those files.

It is possible that a good DR company might manage to read more data from "drive 3" than you can do - which could translate into more data recovered.

At the moment, according to its logs, your RAID controller seems to be trying to rebuild the volume by reading from flaky "drive 3" (and, presumably, writing (i.e. rebuilding) onto one or more other drives - perhaps drives 0 and/or 5). It then stops the rebuild (and from your comments, prevents the user data on the RAID volume being accessible) when a read error or timeout occurs on "drive 3". An attempt at (partial) data recovery would need to prevent that RAID controller behaviour - there can be a few ways to do that (e.g. using specialist RAID recovery software) but some techniques would need the disks to all be accessible as individual drives. I have seen people try to modify a RAID configuration to achieve that, and cause yet more problems. Therefore new/different HBAs may be needed, which is more cost & risk for you, dealing with this unfamiliar situation.

Although I work with large RAID systems in my day job (which isn't DR), your situation seems to have a significant risk of you potentially losing some/all your data - and at best, it's likely that some is already unrecoverable, due to the unreadable sectors on "drive 3" which you are now relying on. If you clone "drive 3" and it has any unreadable sectors (likely), and then you read data from the (then degraded) RAID volume by using that clone, you will get corrupted data in files using those parts of the volume where it needs the data from the previously-unreadable parts of the original "drive 3". In my experience, it can sometimes be difficult and complex to try to find which LBAs in the RAID volume (and hence to later calculate which actual files) have been corrupted in that situation. This is where you (or the relevant expert) would need to know exactly which blocks on "drive 3" were originally unreadable, to perform those calculations.
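
The kind of calculation meant here can be sketched as follows. This is a rough illustration only, and every layout detail in it is an assumption: the 64 KiB chunk (128 sectors), the parity rotation (borrowed from the left-symmetric scheme used by Linux md RAID 6) and the absence of any metadata offset at the start of each member are all guesses. The real on-disk layout of a 3ware 9650SE array may well differ and would have to be established before trusting any result.

```python
from typing import Optional


def member_to_volume_lba(disk: int, disk_lba: int, n_disks: int = 14,
                         chunk_sectors: int = 128) -> Optional[int]:
    """Map a sector on one member disk to a sector of the RAID 6 volume.

    Returns None when the sector falls in a parity chunk (P or Q), because
    parity does not correspond directly to any volume LBA.
    ASSUMED layout: P on disk (n-1 - row % n), Q on the next disk, data
    chunks following Q and wrapping around. Not verified for 3ware.
    """
    row, offset = divmod(disk_lba, chunk_sectors)
    p_disk = n_disks - 1 - (row % n_disks)
    q_disk = (p_disk + 1) % n_disks
    if disk in (p_disk, q_disk):
        return None
    # Position of this disk among the data chunks of the row (0-based).
    data_index = (disk - q_disk - 1) % n_disks
    data_disks = n_disks - 2
    return (row * data_disks + data_index) * chunk_sectors + offset


# Hypothetical unreadable sectors on member 3 (not taken from the logs).
for lba in (5, 128, 1280):
    print(lba, member_to_volume_lba(3, lba))
```

Once volume LBAs are known, they still have to be translated through the partition table and filesystem to find the affected files, which is a separate and filesystem-specific step.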

I hope you can now see my concerns about the risks of a DIY approach, given the current situation. You may want to consider contacting member DR-Kiev who has helped other people to do remote RAID recovery (for a fee). I am not promising that they will accept the job, or that they can recover all your data - but they do have experience. Of course finding a suitable experienced local DR company is another option. Good luck with whatever you decide :)


 Post subject: Re: data recovery for RAID
PostPosted: August 2nd, 2012, 6:05 

Joined: July 28th, 2012, 7:57
Posts: 6
Location: United States
Here's a new dump of logs from the controller, if anyone cares to glance over them. I've been going over it but have not been able to glean too many useful facts.

https://dl.dropbox.com/u/10737837/dumpdcb.zip


 Post subject: Re: data recovery for RAID
PostPosted: August 2nd, 2012, 10:47 

Joined: February 9th, 2009, 16:13
Posts: 2575
Location: Ontario, Canada
If your data is valuable, get it recovered by a pro while it is still recoverable. If you cannot afford a pro right now, remove all the drives and set them aside until such time that you can.

If the data isn't of any value, replace the bad drives, hope that the other drives are healthy and build a new RAID array...and set up a solid backup routine this time.

_________________
Luke
Recovery Force Data Recovery

