
All times are UTC - 5 hours [ DST ]




 Post subject: Weird, rare, random file corruption after defrag (only)
PostPosted: March 4th, 2013, 12:16 

Joined: March 4th, 2013, 11:39
Posts: 3
Location: Switzerland
Hi...

I seem to have been experiencing this kind of corruption for a few years, without really noticing it (it is rare, and only seems to affect small files).

Anyway, first the specs:

It's an older ASUS P5B Deluxe Mainboard (P965 ICH8R Socket 775) with 2x 2GB Sticks of DDR2-RAM
(one of the RAM-sticks died a couple of months ago, but the remaining one and the replacement did pass a couple of hours of MemTest86 with no problem)

What happens is that maybe once in 30 defrags, no matter what tool i use, even those that only use Microsoft's API to move stuff around, one or more files (so far it has only happened to small ones, like .dlls) get corrupted in a strange way. Looking at the file content revealed that many of them are cut off, with parts of another file filling up the remaining space. Sometimes, the whole original content is replaced with different data.
The operating system does not notice anything suspicious, nor does the S.M.A.R.T. data of the Harddisk show any problems (as far as i can see, anyway). It can happen that i do a quick defrag, and after rebooting Windows complains about a corrupt system-dll (which then turns out to consist of pure XML-code from a different file).
Oh, yeah, and it seems to happen preferably with files listed in the "Layout.ini", mostly small system files that get moved to the beginning of the Disk during defrag.
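
For anyone wanting to pin down where a damaged file goes wrong, here is a minimal sketch that compares a suspect file against a known-good copy (from a backup, say) and reports the first byte that differs. The function names and the 4096-byte cluster size are assumptions for illustration, not something taken from this system; a difference that starts exactly on a cluster boundary would merely hint that a whole cluster received the wrong data.

```python
from pathlib import Path

CLUSTER_SIZE = 4096  # assumption: a common NTFS cluster size; adjust for the volume


def first_difference(good: bytes, suspect: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(good, suspect)):
        if a != b:
            return offset
    # Same content up to the shorter length: a length mismatch is the difference.
    if len(good) != len(suspect):
        return min(len(good), len(suspect))
    return None


def describe(good_path: str, suspect_path: str) -> str:
    """Summarise where a suspect file starts to deviate from a good copy."""
    good = Path(good_path).read_bytes()
    suspect = Path(suspect_path).read_bytes()
    offset = first_difference(good, suspect)
    if offset is None:
        return "identical"
    aligned = "aligned" if offset % CLUSTER_SIZE == 0 else "not aligned"
    return f"first difference at byte {offset} ({aligned} to a {CLUSTER_SIZE}-byte cluster)"
```

Calling `describe()` with the path of a backup copy and the path of the corrupted dll gives a one-line summary; reading whole files into memory is fine here because only small files are affected.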

I have never seen stuff like this before on any machine....could it really be that this is something hardware-related?
Or should i be more focused on software, because there's a Security-App (Agnitum Outpost Firewall) that hooks some kernel functions to monitor low-level Hard Disk access? (i'm trying to reproduce the problem, but it seems to happen more rarely now that i'm looking for it :-)

Does anybody have at least an idea of what might be happening here?

Here's the dump from CrystalDisk:

Quote:
----------------------------------------------------------------------------
CrystalDiskInfo 3.9.0 (C) 2008-2010 hiyohiyo
Crystal Dew World : http://crystalmark.info/
----------------------------------------------------------------------------

OS : Windows XP Professional SP3 [5.1 Build 2600] (x86)
Date : 2013/03/04 16:49:30

-- Controller Map ----------------------------------------------------------

+ Intel(R) 82801HR/HH/HO SATA AHCI Controller [ATA]
- SAMSUNG SP2504C

-- Disk List ---------------------------------------------------------------
(1) SAMSUNG SP2504C : 250.0 GB [0-2-0, pd1]

----------------------------------------------------------------------------
(1) SAMSUNG SP2504C
----------------------------------------------------------------------------
Model : SAMSUNG SP2504C
Firmware : VT100-41
Disk Size : 250.0 GB (8.4/137.4/250.0)
Buffer Size : 8192 KB
Queue Depth : 32
# of Sectors : 488397168
Rotation Rate : Unknown
Interface : Serial ATA
Major Version : ATA/ATAPI-7
Minor Version : ATA/ATAPI-7 T13 1532D version 4a
Transfer Mode : SATA/300
Power On Hours : 21357 hours
Power On Count : 1581 count
Temparature : 40 C (104 F)
Health Status : Good
Features : S.M.A.R.T., AAM, 48bit LBA, NCQ
APM Level : ----
AAM Level : FE80h [ON]

-- S.M.A.R.T. --------------------------------------------------------------
ID Cur Wor Thr RawValues(6) Attribute Name
01 100 100 _51 000000000020 Read Error Rate
03 100 100 _25 000000001780 Spin-Up Time
04 _97 _97 __0 000000000C5C Start/Stop Count
05 253 253 _10 000000000000 Reallocated Sectors Count
07 253 253 _51 000000000000 Seek Error Rate
08 253 253 _15 000000000000 Seek Time Performance
09 100 100 __0 00000000536D Power-On Hours
0A 253 253 _51 000000000000 Spin Retry Count
0B 253 100 __0 000000000000 Recalibration Retries
0C _99 _99 __0 00000000062D Power Cycle Count
BB _48 _48 __0 000000140035 Reported Uncorrectable Errors
BE 118 100 __0 000000000028 Airflow Temperature
C2 118 100 __0 000000000028 Temperature
C3 100 100 __0 0000120FFBC2 Hardware ECC recovered
C4 253 253 __0 000000000000 Reallocation Event Count
C5 253 100 __0 000000000000 Current Pending Sector Count
C6 253 253 __0 000000000000 Uncorrectable Sector Count
C7 200 200 __0 000000000000 UltraDMA CRC Error Count
C8 100 100 __0 000000000000 Write Error Rate
C9 100 _99 __0 000000000000 Soft Read Error Rate
CA 253 253 __0 000000000000 Data Address Mark Error
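
The raw values in the dump are hexadecimal. As a small sketch, the snippet below converts a few of them to decimal, treating each 6-byte raw field as a single unsigned integer. That treatment is an assumption that only holds for simple counters; some attributes (possibly BB and C3 here) pack several fields into the raw value in a vendor-specific way, so they are left out.

```python
# Raw values copied from the CrystalDiskInfo dump (hexadecimal strings).
RAW_VALUES = {
    "Start/Stop Count": "000000000C5C",
    "Power-On Hours": "00000000536D",
    "Power Cycle Count": "00000000062D",
    "Temperature": "000000000028",
}


def decode_raw(hex_string: str) -> int:
    """Interpret a S.M.A.R.T. raw value as one unsigned integer (an assumption)."""
    return int(hex_string, 16)


decoded = {name: decode_raw(raw) for name, raw in RAW_VALUES.items()}

for name, value in decoded.items():
    print(f"{name}: {value}")
```

The decoded Power-On Hours (21357) and Power Cycle Count (1581) agree with the "Power On Hours" and "Power On Count" lines in the header of the dump, and the temperature decodes to 40, matching the reported 40 C.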


 
 Post subject: Re: Weird, rare, random file corruption after defrag (only)
PostPosted: March 4th, 2013, 16:22 

Joined: March 4th, 2013, 11:39
Posts: 3
Location: Switzerland
Thanks for your reply!

Oh, man...so it can really be anything. :?
I've seen Hard Disks corrupt data, but never in this way....so i thought i could at least rule out bad hardware, but it seems i can't.

I wasn't defragging too much, maybe once a month. Only a few months ago i noticed a corrupt .exe of a program i rarely use.
And last week it happened again (this time it hit a commonly used dll), directly after a defrag. So yesterday i started torturing the HDD with defrags, and indeed i could produce 3 corrupt dll's.
Today, however, countless defrags could not create one single bad file, despite md5-checking them..
I just hope my backups are okay, since this seems to have been going on for years, without me noticing.
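
The md5-checking mentioned above can be sketched roughly like this: hash every file under a directory before the defrag, hash again afterwards, and list whatever changed. The function names are made up for this example and this is not the exact tool used here; it only relies on Python's standard `hashlib` and `pathlib`.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> dict:
    """Map each file's path (relative to root) to its MD5 digest."""
    base = Path(root)
    return {
        str(p.relative_to(base)): md5_of(p)
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict, after: dict) -> list:
    """Names whose digest differs from, or is missing in, the second manifest."""
    return sorted(name for name, digest in before.items() if after.get(name) != digest)
```

Build one manifest before defragging and one after, then pass both to `changed_files()`. One caveat: the second pass may be answered from the operating system's file cache instead of the disk, so rebooting in between (the corrupt dll was also only noticed after a reboot) is probably the safer way to check.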

Well, i'm not trusting this system anymore...but i'm curious enough to continue to use it :mrgreen:
I'll put in a new System Disk, new RAM and set up a new System....and see what happens next :D


 
 Post subject: Re: Weird, rare, random file corruption after defrag (only)
PostPosted: March 14th, 2013, 12:40 

Joined: March 4th, 2013, 11:39
Posts: 3
Location: Switzerland
Solved.
It was indeed the Hard Disk.

Additionally, the error did NOT happen when the drive was in IDE mode, but only when operating with NCQ enabled (AHCI).
I've tried to reproduce the error by copying huge numbers of small files over to the drive and md5-checking them, but to no avail. It only happened to small files when defragging them.
So my best guess is that the corruption occurs in an internal buffer when handling NCQ....because the filesystem was always logically fine, it's only file contents that got truncated.
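
For reference, a copy-and-verify test of the kind described above could look roughly like the sketch below: write many small files of random data, remember the MD5 of what was written, and later check that every file still matches. The names `write_small_files`, `verify`, and the `probe_` file prefix are hypothetical, and as noted, a test like this did not trigger the error on this drive.

```python
import hashlib
import os
from pathlib import Path


def write_small_files(target_dir: str, count: int, size: int = 4096) -> dict:
    """Write `count` files of random bytes; return {filename: md5 of what was written}."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    expected = {}
    for index in range(count):
        payload = os.urandom(size)
        name = f"probe_{index:06d}.bin"
        (target / name).write_bytes(payload)
        expected[name] = hashlib.md5(payload).hexdigest()
    return expected


def verify(target_dir: str, expected: dict) -> list:
    """Return the names of files whose current content no longer matches."""
    target = Path(target_dir)
    bad = []
    for name, digest in expected.items():
        current = hashlib.md5((target / name).read_bytes()).hexdigest()
        if current != digest:
            bad.append(name)
    return bad
```

Running `verify()` right after writing mostly exercises the file cache, so it makes more sense to run it after a reboot or after a defrag pass over the probe files.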

In case anyone ever experiences the same symptoms and wonders if something this crazy is possible: it is. :D


 