Vulcan wrote:
There are several parts which I don't agree with, in your analysis & comments, even before reaching the questions in your posting.

There are also some pieces of missing data, which would definitely help to confirm or deny some possibilities - some of the missing data is gone forever (e.g. previous "problem" drives), while some other useful data might still be available from you.
I'll try to provide the missing data.
Vulcan wrote:
I don't know whether it is worth my time to try to assist further, because many first time posters here disappear when they don't get the response that they want or expect.
Well, I can't promise I will still be here in 12 months (HDDs are not my hobby), but I will reply to the questions until we can consider the topic closed.
Vulcan wrote:
Bernd wrote:
What do these slowdowns during the scan, with no errors reported, mean? Are some sectors "weak" but still readable after some retries?
Yes, that's possible.
OK, that's what I thought.
Vulcan wrote:
It would be useful to see similar benchmark graphs from your previous "problem" disks, but you don't mention that they are available.
I'm sure the Samsung F1 HD322HJ 320 GB, the Seagate ST3500418AS 500 GB, and the Samsung F3 HD502HJ 500 GB all had normal benchmark and surface scan results when I installed them in the computer.
I still have the F1 somewhere (not in the PC). I don't remember whether I benchmarked it after the issue appeared, but the curve very probably had the dips, because the drive had bad blocks (red in HD Tune) and I could see the test slowing down a lot on the bad blocks. I mention HD Tune, but I tested other tools and the results were similar.
As for the Seagate, I don't remember whether I benchmarked it. I only remember that surface scans reported no bad blocks, and there was no data corruption or crash either. The SMART reallocated sector count was the only issue. I don't have it anymore: I returned it to the store where I bought it, and they sent it to Seagate. After more than 2 months with no news from Seagate, they agreed to replace it with a Western Digital Caviar Blue. That is my current drive.
Vulcan wrote:
It would also be a good idea for you to repeat this test several times on the current disk, and see whether the "dips" in the benchmark graph occur at the same parts of the disk each time, to confirm the disk as the likely cause of those "dips".
As explained above, my current system disk is a WD. But my latest problem disk, the Samsung F3 HD502HJ, is still in the case. The results are consistent: the dips are always in the same places, as in the capture I posted in my first post.
SMART data when I installed it (2010/10/28):
Code:
ID Current Worst Threshold Data Status
(01) Raw Read Error Rate 100 100 51 0 ok
(02) Throughput Performance 252 252 0 0 ok
(03) Spin Up Time 83 82 25 5452 ok
(04) Start/Stop Count 100 100 0 8 ok
(05) Reallocated Sector Count 252 252 10 0 ok
(07) Seek Error Rate 252 252 51 0 ok
(08) Seek Time Performance 252 252 15 0 ok
(09) Power On Hours Count 100 100 0 7 ok
(0A) Spin Retry Count 252 252 51 0 ok
(0B) Calibration Retry Count 252 252 0 0 ok
(0C) Power Cycle Count 100 100 0 8 ok
(BF) G-sense Error Rate 252 252 0 0 ok
(C0) Unsafe Shutdown Count 252 252 0 0 ok
(C2) Temperature 64 64 0 917523 ok
(C3) Hardware ECC Recovered 100 100 0 0 ok
(C4) Reallocated Event Count 252 252 0 0 ok
(C5) Current Pending Sector 252 252 0 0 ok
(C6) Offline Uncorrectable 252 252 0 0 ok
(C7) Ultra DMA CRC Error Count 200 200 0 0 ok
(C8) Write Error Rate 100 100 0 0 ok
(DF) Load/Unload Retry Count 252 252 0 0 ok
(E1) Load/Unload Cycle Count 100 100 0 8 ok
SMART data today:
Code:
Model : SAMSUNG HD502HJ
Firmware : 1AJ10001
Serial Number : S20BJ9BZ725612
Disk Size : 500.1 GB (8.4/137.4/500.1)
Buffer Size : 16384 KB
Queue Depth : 32
# of Sectors : 976773168
Rotation Rate : 7200 RPM
Interface : Serial ATA
Major Version : ATA8-ACS
Minor Version : ATA8-ACS version 6
Transfer Mode : SATA/300
Power On Hours : 1274 hours
Power On Count : 504 count
Temparature : 25 C (77 F)
Health Status : Good
Features : S.M.A.R.T., APM, AAM, 48bit LBA, NCQ
APM Level : 0000h [OFF]
AAM Level : FE00h [OFF]
-- S.M.A.R.T. --------------------------------------------------------------
ID Cur Wor Thr RawValues(6) Attribute Name
01 100 100 _51 000000000005 Read Error Rate
02 252 252 __0 000000000000 Throughput Performance
03 _82 _82 _25 00000000155E Spin-Up Time
04 100 100 __0 0000000002E5 Start/Stop Count
05 252 252 _10 000000000000 Reallocated Sectors Count
07 252 252 _51 000000000000 Seek Error Rate
08 252 252 _15 000000000000 Seek Time Performance
09 100 100 __0 0000000004FA Power-On Hours
0A 252 252 _51 000000000000 Spin Retry Count
0B 252 252 __0 000000000000 Recalibration Retries
0C 100 100 __0 0000000001F8 Power Cycle Count
BF 252 252 __0 000000000000 G-Sense Error Rate
C0 252 252 __0 000000000000 Power-off Retract Count
C2 _64 _64 __0 001B000A0019 Temperature
C3 100 100 __0 000000000000 Hardware ECC recovered
C4 252 252 __0 000000000000 Reallocation Event Count
C5 252 252 __0 000000000000 Current Pending Sector Count
C6 252 252 __0 000000000000 Uncorrectable Sector Count
C7 200 200 __0 000000000000 UltraDMA CRC Error Count
C8 100 100 __0 000000000000 Write Error Rate
DF 252 252 __0 000000000000 Load/Unload Retry Count
E1 100 100 __0 0000000002E8 Load/Unload Cycle Count
The only interesting data (to my eyes) is the read error rate, whose raw value went from 0 to 5. But I don't know exactly how to interpret it.
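For anyone comparing the two dumps: the first lists raw values in decimal and the second in hexadecimal. A minimal sketch of the conversion follows, using values copied from the second dump. Splitting the temperature attribute into three 16-bit words is only an assumption about the vendor's layout; the lowest word (25) does match the reported 25 C, but the meaning of the other two words is not confirmed here.

```python
# Raw values copied from the "SMART data today" dump (6 bytes each, in hex).
RAW = {
    "01 Read Error Rate": "000000000005",
    "04 Start/Stop Count": "0000000002E5",
    "09 Power-On Hours": "0000000004FA",
    "0C Power Cycle Count": "0000000001F8",
    "C2 Temperature": "001B000A0019",
    "E1 Load/Unload Cycle Count": "0000000002E8",
}


def as_int(raw_hex):
    """Interpret the whole raw field as one integer."""
    return int(raw_hex, 16)


def as_words(raw_hex):
    """Split the raw field into three 16-bit words, most significant first.

    Assumption: the drive packs several small numbers into one raw value.
    """
    value = int(raw_hex, 16)
    return [(value >> shift) & 0xFFFF for shift in (32, 16, 0)]


for name, raw in RAW.items():
    if name.startswith("C2"):
        print(name, as_words(raw))
    else:
        print(name, as_int(raw))
```

The power-on hours (0x4FA) come out as 1274 and the power cycle count (0x1F8) as 504, which agrees with the summary lines at the top of the same dump. What one unit of the raw read error rate means is vendor-specific, so the sketch only converts the number and does not interpret it.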
Vulcan wrote:
It is interesting (but inconclusive at this stage), that the dips in the benchmark graphic which you included, are all towards the outside (OD) of the drive. Again, it would be good to compare this pattern with other "problem" drives.
The bad blocks on the Samsung F1 were in the same zone: in my system partition (50 GB), in 2 groups of bad blocks.
Maybe something interesting to note:
- I noticed the F1 had a problem because it crashed during boot. After investigating, I found out that the registry was damaged. So it seems the registry was stored on the damaged blocks.
- I noticed the F3 had a problem because calc refused to launch ("invalid Win32 application"). I checked the event log and saw that, among many errors like "can't launch service", the registry was damaged and Windows had switched to the registry backup.
The registry is probably the most read and written file in Windows, so the sectors where it's stored are probably the most used on the disk. Moreover, I don't need a lot of space on my disks: I have a 50 GB system partition and the rest is a data partition. I don't have a lot of data on that partition (I think I never had more than 100 GB on it), and since I defragment my disks, most of the files are at the beginning of the partition. So the zone with no problems probably contains no files. Maybe the most used sectors wear out, I don't know... It's just a guess; I don't know how or why a sector becomes damaged.
Vulcan wrote:
It would also be a good idea to watch for changes over time in that benchmark graph and in the SMART data, so you should be collecting and keeping those results frequently from now onwards, for later analysis.
I took some copies of the benchmark result and of the SMART data for the WD I currently use.
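Keeping snapshots is only useful if they get compared later. Here is a rough sketch of such a comparison, assuming each snapshot has been typed into a dictionary of attribute ID to raw value. The sample numbers are taken from the two Samsung HD502HJ dumps earlier in this post; the function name is made up for the illustration.

```python
def diff_snapshots(old, new):
    """Return {attribute_id: (old_raw, new_raw)} for every raw value that changed."""
    changed = {}
    for attr_id in sorted(old.keys() & new.keys()):
        if old[attr_id] != new[attr_id]:
            changed[attr_id] = (old[attr_id], new[attr_id])
    return changed


# Raw values at installation (decimal in the first dump).
installed = {"01": 0, "05": 0, "09": 7, "0C": 8, "C4": 0, "C5": 0, "C6": 0}
# Raw values today (hexadecimal in the second dump).
today = {"01": 0x5, "05": 0x0, "09": 0x4FA, "0C": 0x1F8,
         "C4": 0x0, "C5": 0x0, "C6": 0x0}

for attr_id, (before, after) in diff_snapshots(installed, today).items():
    print(attr_id, before, "->", after)
```

With these samples, only the read error rate, the power-on hours, and the power cycle count differ; the reallocated, pending, and uncorrectable counters stay at zero.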
Vulcan wrote:
Bernd wrote:
And maybe sometimes the drive doesn't correctly read or write these sectors and corrupts some files?
That's unlikely, for the large amount of corruption you have reported - large scale data corruption tends to have other causes, in my experience. Having said that, mis-correction when using ECC is always a (very very very small) mathematical possibility.
Does it have something to do with the SMART raw read error rate? I don't really understand its meaning.
Vulcan wrote:
IMHO you would need to do much more testing (= time and perhaps money), to make progress on investigating the cause of this corruption e.g. using a different disk drive for real data, and dedicating that current Samsung disk drive to further investigation work.
That's why my system is now on the new Western Digital. The Samsung is plugged into the motherboard but disabled. I can easily enable it for investigation.
Vulcan wrote:
Also, although you seem to be correcting for corruption in the Windows files, I did not see any mention by you of finding or testing for corruption in your data files. That information would be very useful.
Well, I used sfc /verifyonly because it's an easy way to identify corrupted system files, since Windows knows the original hashes. For other files it's less easy. I know that 2 or 3 JPEGs in my pictures folder were damaged, and that one day I tried to launch an audio editor but 3 or 4 of its files were damaged (exe, dll...). Also one part of a big Thunderbird file containing my emails. Unless I open and verify all my files, documents, and software, it's not easy to find out what is corrupted. Same thing for my data partition: it mostly contains music, videos, and ISOs. And copies of my previous damaged disks, lol. I found no corrupted files on this data partition, but I didn't have the patience to watch all the videos...
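For data files there is no list of original hashes, but one can be built while the files are still believed to be good and re-checked later. A minimal sketch of that idea, using only the Python standard library; the function names are made up for the illustration. A changed hash only proves the content changed, so files that were edited on purpose will show up as well.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos or ISOs do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hash."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """List files present in both manifests whose content hash differs."""
    return sorted(
        name
        for name, digest in old_manifest.items()
        if name in new_manifest and new_manifest[name] != digest
    )
```

The first manifest would be saved somewhere off the suspect disk, and `changed_files` run against a fresh manifest after some weeks of use.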
Vulcan wrote:
Bernd wrote:
Is it possible that the magnetic field generated by the PSU damages the plate of the HDD?
Interesting theory - disks do have a limit to the size of external magnetic field which they are expected to cope with, before it affects normal operation. You would need to use a gaussmeter in order to further investigate that theory directly; or else you could consider a different approach e.g. proactively changing that PSU for something different, which was known not to affect nearby disks.
I don't have a gaussmeter. The PSU is a Corsair HX520, based on a Seasonic. In 2008 this PSU was ranked as one of the best (like most Seasonic units), including in reviews testing voltage stability, component quality, compliance with standards, ... This model shouldn't have a magnetic problem. Moreover, I know I'm not the only one who has this PSU at the bottom of the case, near the disks, and I'm not aware of similar problems. It's possible that my PSU has a problem, though. But I don't have another PSU and I don't want to buy one. If someone tells me that it's common for PSUs to damage disks, I will investigate this more seriously. But according to the members posting in this topic, it's not common.
Vulcan wrote:
It would also be interesting to see if the disk benchmark graph changed at all, when that disk is removed from the case and moved as far away from the PSU as possible (if necessary by using max length SATA cables). If the disk benchmark graph did change (and improved i.e. fewer or smaller "dips") that would fit with the possibility of a problem caused by a magnetic field from the PSU. However the lack of a change in the graph would not disprove that possibility.
Good idea. I'll try to do the test.
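To compare the graphs from two runs without judging by eye, the dips could be located numerically. This is only a sketch: it assumes the benchmark samples are available as (position in GB, MB/s) pairs, which may mean noting them down by hand, and the window, ratio, and tolerance values are arbitrary choices. Each sample is compared with the median of its neighbours rather than with the whole disk, because throughput normally falls from the outer to the inner tracks.

```python
from statistics import median


def find_dips(samples, window=5, ratio=0.8):
    """Return positions whose speed is below ratio * the local median speed.

    samples: list of (position_gb, mb_per_s) tuples in disk order.
    """
    dips = []
    for index, (position, speed) in enumerate(samples):
        low = max(0, index - window)
        high = min(len(samples), index + window + 1)
        local_median = median(s for _, s in samples[low:high])
        if speed < ratio * local_median:
            dips.append(position)
    return dips


def common_dips(dips_a, dips_b, tolerance_gb=1.0):
    """Return dips from run A that have a dip in run B within tolerance_gb."""
    return [a for a in dips_a
            if any(abs(a - b) <= tolerance_gb for b in dips_b)]
```

Dips that appear in both runs at the same positions point towards the disk surface; dips that move around or vanish between runs point towards something else.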
Vulcan wrote:
Bernd wrote:
In the P182 the drives are in a vertical position, not horizontal. Could it be the cause?
I haven't checked the datasheets for the 3 drives that you mentioned (you can do that). However all modern disks which I have seen, are specified for operation when mounted both vertically & horizontally.
I know of a forum with a lot of owners of my case; I'll ask whether any of them have had problems.
Vulcan wrote:
In summary: IMHO remote diagnosis of the problem(s) on your system, is likely to be very difficult as they appear to be intermittent
Yes, I know. I see no obvious reason for my problem. I already posted on some forums (less HDD-oriented than hddguru) and most people think it's bad luck. As far as I know (or rather, have read), a bad sector is mostly bad luck: your sample just happens to be a bad one. But three disks in a row is hard to believe.
Vulcan wrote:
Good luck

Thanks
fzabkar wrote:
Vertical mounting, at least in Seagate's case, is a problem for some drives, regardless of their specifications. For example, people have reported that the ST32000542AS drives perform very poorly when mounted vertically. Reallocated sectors develop on a regular basis.
See this thread where several new ST32000542AS drives developed bad sectors when mounted vertically, but not horizontally:
http://forums.seagate.com/t5/Internal-A ... /m-p/42506
Thanks for the link.
fzabkar wrote:
ISTM that the dips in HD Tune's read benchmark graph are probably consistent with the slowdowns in ESTool and HDAT2. To confirm this with MHDD, you need to temporarily reconfigure your SATA controller for IDE, legacy, or compatibility mode in your BIOS setup.
Well, if the BIOS is set to AHCI, MHDD sees no drive. In IDE mode, it sees my DVD-ROM and the Western Digital, but not the Samsung. I didn't test with only the Samsung plugged in, though. But as you said, it's probably consistent.
fzabkar wrote:
Are you sure your files were corrupted? I'm not using a recent MS OS, but in earlier OSes, SFC (System File Checker) compared the DLL and EXE files against their original installation versions. Hence, SFC was incapable of differentiating between genuine updates and malicious changes or file corruptions. Is it possible that your "corrupted" files were actually MS updates?
Yes. Exe and dll files should begin with MZ, and these don't. The content of the files looks like compressed data (zip or rar can't compress it any further). I'm unable to really identify the data; there are no obvious strings. All the corrupted files are similar. I attached a corrupted file (HelpPane.exe).
I know it looks like a virus, but I have an AV (Kaspersky) running in the background; I also did a full scan and it found nothing. I checked the running tasks, the executables run at startup, the services, and so on: nothing suspicious. Moreover, the modification times of the corrupted files were not changed. My guess is that the MFT or something similar is corrupted and some file index entries incorrectly point to the wrong sectors (containing jpg, divx, ...).
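The MZ check can be automated for a whole folder. A minimal sketch follows, assuming only that every valid exe or dll starts with the two bytes "MZ"; the function names are made up for the illustration. A file that passes can still be damaged further in, so this only catches the kind of corruption described above.

```python
from pathlib import Path

# Extensions treated as Windows executables in this sketch.
PE_EXTENSIONS = {".exe", ".dll"}


def has_mz_header(path):
    """Return True if the file starts with the two-byte 'MZ' signature."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"MZ"


def suspicious_files(root):
    """List exe/dll files under root that do not start with 'MZ'."""
    return sorted(
        str(path)
        for path in Path(root).rglob("*")
        if path.is_file()
        and path.suffix.lower() in PE_EXTENSIONS
        and not has_mz_header(path)
    )
```

Running `suspicious_files` on a program folder gives a list of candidates to inspect by hand or to compare against a known good copy.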
Attachment:
HelpPane.rar [485.57 KiB]