
All times are UTC - 5 hours [ DST ]




 Post subject: two disks with similar symptoms - maybe the computer case?
PostPosted: November 15th, 2013, 7:58 
Joined: November 15th, 2013, 7:38
Posts: 3
Location: Europe
Hi!

I have two hard disks that show interesting (to me) symptoms, and I wonder if any of the experts here might have an idea what the cause could be. There is no risk of actual data loss (just the time and effort to replace the disks when they are dead).

In any case, here's what happened - we mailed a HDS722020ALA330 (Hitachi, 3TB) to be put into a server as a secondary disk. About a week after it started to be used, it got very slow (it was reading at ~2 MB/s and writing at 500 KB/s - obviously something was very wrong with it), and it acquired hundreds of bad sectors each day.

Ok, shit happens, so we replaced it with a WDC WD30EZRS-00J99B0. After about two weeks, it developed the same symptoms (1-2 MB/s read, 128-500 KB/s write), with one difference - the number of reallocated sectors didn't increase if the sectors were repaired (more on that soon).

I asked a technician at the location to see if something was pressing on the disk or something else was amiss, as I didn't want to believe that two disks would go down similarly in the same location. Technician said they removed the disk, cleaned the case, and saw nothing out of the ordinary.

And lo and behold, the disk worked fine for two weeks, after which it returned to its low-performance state.

Since I didn't want to prepare yet another disk (takes a week), and 2 MB/s was fast enough at the time, I decided to watch it go dead. For this, I wrote a script that reads SMART vendor log 0xc0 (available on request in the unlikely case somebody wants it), which seemed to contain bad blocks before they were shown as pending by SMART. The script simply tries to read each block multiple times, and when it succeeds, it just writes it again. This has had a 100% chance of resurrecting the data. The bad blocks are almost always near the same LBA (4600000000 +- 100000000), and usually need 1-2 retries, sometimes up to 12, to read successfully.
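
The retry-and-rewrite loop described above could look roughly like the following Python sketch. It is not the actual script from this post: the name `repair_block`, the 4096-byte block size, and the injectable `read`/`write` hooks are assumptions made for illustration, and it does not parse SMART log 0xc0 at all (the block numbers are assumed to be known already).

```python
import os

BLOCK_SIZE = 4096   # assumption: one 4 KiB block per suspect location
MAX_RETRIES = 12    # the post mentions up to 12 retries were sometimes needed


def repair_block(fd, block_no, read=os.pread, write=os.pwrite):
    """Try to read one block; on success, write the same bytes back.

    Returns the number of attempts it took, or None if every attempt failed.
    `read` and `write` are injectable (hypothetical hooks) so the logic can
    be exercised without a failing disk.
    """
    offset = block_no * BLOCK_SIZE
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            data = read(fd, BLOCK_SIZE, offset)
        except OSError:
            continue  # read error: retry the same block
        if len(data) != BLOCK_SIZE:
            return None  # short read (e.g. past the end), nothing to rewrite
        write(fd, data, offset)  # rewrite the block with its own contents
        os.fsync(fd)             # push the write out of the page cache
        return attempt
    return None
```

On a real disk the descriptor would come from something like `os.open("/dev/sdX", os.O_RDWR)` (placeholder device name). Reads may be served from the page cache unless the device is opened with `O_DIRECT`, which requires aligned buffers and is left out of this sketch.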

This had two effects. First, the "current pending sector" count goes down by one for every repaired sector (4, 3, 2, 1, 0, 65535, 65534...) even when it hasn't been incremented yet, and second, the disk has been in this state for 1.5 years now and shows no other signs of degradation (which I find amazing).
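
The sequence 4, 3, 2, 1, 0, 65535, 65534 is what you would see if the counter were an unsigned 16-bit value that gets decremented below zero. Whether the firmware really stores it that way is an assumption; the snippet below only reproduces the arithmetic (the name `decrement_u16` is made up for illustration).

```python
def decrement_u16(value):
    """Decrement a value as an unsigned 16-bit counter would (wraps at 0)."""
    return (value - 1) & 0xFFFF


# Start at 4 pending sectors and "repair" six sectors in a row.
sequence = [4]
for _ in range(6):
    sequence.append(decrement_u16(sequence[-1]))

print(sequence)  # [4, 3, 2, 1, 0, 65535, 65534]
```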

So, my question is - does anybody have any idea how you can get a disk into a state where it reads and writes extremely slowly and gets a lot of problematic sectors (which turn out to be fine after all)? I am not an expert at all - the only thing I can imagine is that something adds pressure to the disk case so that it has trouble spinning, or some similar mechanical problem. Maybe one of you has an idea what's going on?

(As a last note, we got the Hitachi disk RMA'd and it's now used in the same case as the primary disk, without any issues whatsoever. I suspect the WD is simply more picky before actually reallocating a good sector.)


 Post subject: Re: two disks with similar symptoms - maybe the computer cas
PostPosted: November 15th, 2013, 15:18 
Joined: July 2nd, 2011, 14:16
Posts: 463
Location: England
Do the drive and the components heat up while it's always on? Perhaps there is not enough ventilation.

Another idea is to try the disk in an external caddy and run tests and such to rule out the server or whatever you are using.

Shane


 Post subject: Re: two disks with similar symptoms - maybe the computer cas
PostPosted: November 15th, 2013, 19:23 
Joined: December 4th, 2012, 1:35
Posts: 3903
Location: Adelaide, Australia
Are these disks consumer/desktop disks or server disks? I am not "into" hard drives, but when I replaced disks in our HP ML350 ProLiant server, some of them were specified as "24/7, always on" drives, suited to servers. Granted, regardless of that, it is a short timeframe for issues to appear. I am wondering if there are any other factors, such as database access from some kind of app, some backup operation, or logging.

Personally, I don't think the pressure-on-the-case theory holds. If you take into account the extremely small tolerances, and think about how much you could push on it before you damage the disk in a big way, then you would need to be somewhere in between that situation and no pressure at all, if you can see where I am coming from.

Does your server have any management software or utilities that may need patching, or that are intrusive to the hardware?

One of my new servers is a Dell PowerEdge T320, and it has very detailed utilities snaking their way through the system.


 Post subject: Re: two disks with similar symptoms - maybe the computer cas
PostPosted: November 15th, 2013, 21:27 
Joined: November 15th, 2013, 7:38
Posts: 3
Location: Europe
The disk reports 26° via SMART now, and the logs indicate it was never above 30, even in summer (the other disk in the case reports 33°, which is more reasonable).

They are normal desktop disks (both were 4K block/3TB). I have lots of these disks at home (8 Hitachi, 6 WD, 6 Seagate), and they work just fine here. I also maintain a number of other servers with similar disks, and disk problems are extremely rare (and certainly lead to disk death, not 18 months of sustained slow operation).

The board is a Supermicro board, and I don't have any accessible management interface on it. The problem is that I can't experiment except from the software side (I can only send disks to the datacenter for replacement).

The Hitachi disk did have _very_ heavy database load (constant seeking), but I moved this load to the primary disk (now the RMA'd Hitachi) when it got slow, and the primary disk has not developed any issues so far. The WD disk now in there is only used to serve large files (movies) via HTTP.

The reason I suspect *some* mechanical problem is that the disks worked fine after the case was opened, and the primary disk works fine. It could just as well be bad voltages though, but then only for that SATA power plug. I am pretty convinced it isn't a surface defect (the sectors are always readable and repairable).

The puzzling thing is that the WD disk just chugs along with it, detecting bad/unreadable sectors every day that get repaired again and again, and the disk overall just reads and writes very slowly, but doesn't seem to break long-term. I haven't seen a disk do that before.

Anyways, thank you for your input so far. It seems to be no obvious/well known issue at least :)


 Post subject: Re: two disks with similar symptoms - maybe the computer cas
PostPosted: November 16th, 2013, 16:04 
Joined: July 2nd, 2011, 14:16
Posts: 463
Location: England
I think there might be something wrong with the main board that runs the joint. It is probably faulty in some way, either misreading data sent from the drives, or with a chip overheating, or just simply faulty. You could also have capacitors in the power supply on their way out.

You might want to test the drives in an independent system and then look over the server setup in detail.

Shane


 Post subject: Re: two disks with similar symptoms - maybe the computer cas
PostPosted: November 16th, 2013, 16:44 
Joined: September 8th, 2009, 18:21
Posts: 16960
Location: Australia
The symptoms sound like the drives are running in PIO mode.

Could we see a HD Tune read benchmark graph?

On the WD drives, try reducing the SATA link rate via the OPT1 jumper.

_________________
A backup a day keeps DR away.


 Post subject: Re: two disks with similar symptoms - maybe the computer cas
PostPosted: November 16th, 2013, 21:39 
Joined: November 15th, 2013, 7:38
Posts: 3
Location: Europe
No, the symptoms (phantom bad sectors) cannot be caused by PIO mode (or similar mismatch). (Also, Linux cannot do PIO in AHCI mode).

I can't run HD Tune because I can't install Windows on it, and it would take slightly less than a month to do a full read :)
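
As a rough check of the "slightly less than a month" estimate, here is the arithmetic for a full sequential read at the rates quoted earlier in the thread. The decimal 3 TB capacity (3e12 bytes) and the intermediate 1.2 MB/s rate are assumptions; `full_read_days` is just an illustrative helper.

```python
SECONDS_PER_DAY = 86400


def full_read_days(capacity_bytes, bytes_per_second):
    """Days needed to read the whole disk once at a constant rate."""
    return capacity_bytes / bytes_per_second / SECONDS_PER_DAY


# Assumption: 3 TB means 3e12 bytes; rates are in decimal MB/s.
for rate_mb in (1.0, 1.2, 2.0):
    days = full_read_days(3e12, rate_mb * 1e6)
    print(f"{rate_mb} MB/s -> {days:.1f} days")
```

At 1-2 MB/s this comes out at somewhere between about 17 and 35 days, so a figure just under a month corresponds to an average a little above 1 MB/s.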


 Post subject: Re: two disks with similar symptoms - maybe the computer cas
PostPosted: November 16th, 2013, 22:03 
Joined: September 8th, 2009, 18:21
Posts: 16960
Location: Australia
This is what PIO mode looks like (when the controller is configured for IDE mode):
http://community.wd.com/t5/image/server ... bl-1&px=-1

I understand that you are using AHCI mode, but a benchmark graph may still be informative. BTW, the test executes relatively quickly. In any case, you could short-stroke the drive to 1GB or less.

As for Linux GUI benchmark tools, perhaps the following guide might be of use:
http://askubuntu.com/questions/87035/ho ... erformance
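
If a GUI tool is not an option, a crude read test limited to the first 1GB of the drive can be sketched in a few lines of Python. This is only an illustration, not a replacement for a proper benchmark: `read_throughput` and its parameters are made-up names, the device path in the usage comment is a placeholder, and the result will be inflated if the data is already in the page cache.

```python
import time


def read_throughput(path, total_bytes=1 << 30, chunk=1 << 20):
    """Sequentially read up to total_bytes from path.

    Returns (bytes_read, rate_in_MB_per_s). Reads stop early at end of file.
    """
    done = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:  # unbuffered at the Python level
        while done < total_bytes:
            buf = f.read(min(chunk, total_bytes - done))
            if not buf:
                break  # reached the end of the file or device
            done += len(buf)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    return done, done / elapsed / 1e6


# Hypothetical usage (needs read access to the device, usually root):
# n, rate = read_throughput("/dev/sdX")
# print(f"read {n} bytes at {rate:.1f} MB/s")
```

Running it at a few different times of day, or on both disks in the same case, would give numbers that can be compared directly with the 1-2 MB/s reported above.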


