All times are UTC - 5 hours [ DST ]




 Post subject: About ECC
PostPosted: June 28th, 2022, 8:25 

Joined: May 13th, 2019, 7:50
Posts: 907
Location: Nederland
I was wondering about this:

If I am correct, 'average' ECC can correct a 1-bit error and detect a 2-bit error. Does this imply that 3-bit errors would go undetected?

_________________
Joep - http://www.disktuna.com - video & photo repair & recovery service


 Post subject: Re: About ECC
PostPosted: June 28th, 2022, 15:30 

Joined: October 3rd, 2005, 0:40
Posts: 4311
Location: Hungary
Not exactly; the correction capability of the code varies by implementation. Detection is highly reliable: it is very hard to produce data that would yield the same ECC and thus go undetected.
But I am not a math expert in BCH.

pepe

_________________
Adatmentés - Data recovery


 Post subject: Re: About ECC
PostPosted: June 28th, 2022, 15:50 

Joined: September 8th, 2009, 18:21
Posts: 15463
Location: Australia
I would think that a particular algorithm might be designed to correct single-bit errors and detect multi-bit errors.

BTW, I can take a Seagate F3 ROM and relatively easily create a 2-bit error which will go undetected. However, this CRC algorithm is one of the simpler ones.

_________________
A backup a day keeps DR away.


 Post subject: Re: About ECC
PostPosted: June 28th, 2022, 20:38 

Joined: September 29th, 2005, 4:10
Posts: 402
Location: Moscow
WD Drive Info Table.rep:

Error Correction Information
No. of Interleaves...................... : 1
OTF Corr. Span in Bytes................. : 22
Max FW Corr. Span in Bytes.............. : 43
No. of ECC Bytes + CRC Bytes............ : 55


 Post subject: Re: About ECC
PostPosted: June 29th, 2022, 8:43 

Joined: May 13th, 2019, 7:50
Posts: 907
Location: Nederland
I am no ECC expert either, I suck at math. I am asking because I was watching this, listen closely to what he says:

https://youtu.be/fqzv2YXMFRs?t=1267

This is Google's poor CC:

Code:
doesn't always work and this is one of the one of the reasons why this work you

know you have you could have ECC such as single error correct error depicts expected but if you take out three cells

in the same word lines three bits in the same word line that ecc doesn't work anymore in fact

it doesn't even detect it it just goes silent so this one of the techniques

that that that we use is to interleave the the bits so that physically bits


To me it sounds as if he's suggesting that ECC will fail to detect errors under certain 'circumstances': when the bits are not interleaved and more than 2 bits in a row flip. So if we apply this to HDDs, would ECC fail to detect the error in that case?

The video is quite interesting to watch anyway IMO. In short: cosmic rays may be one of the major contributors to silent data corruption.

_________________
Joep - http://www.disktuna.com - video & photo repair & recovery service


 Post subject: Re: About ECC
PostPosted: June 29th, 2022, 10:52 

Joined: October 3rd, 2005, 0:40
Posts: 4311
Location: Hungary
The CRC in the F3 ROM is 16-bit; NAND ECC is a lot more than that (like Tomset said). Plus, I doubt they use a CRC if ECC is present, because error detection is done by the syndrome bytes, which are easily calculated in hardware and are required for correction too, as far as I remember. So an additional CRC would be pointless and a waste of resources if a stronger algo is run anyway.

Since this thread is in the NAND section, I suppose we are talking about NAND and not HDD. HDDs use Reed-Solomon codes, NAND uses BCH (mostly), which are completely different algos. So that info about WD does not apply to NAND. Reed-Solomon is good at burst errors (HDD sectors are likely to be corrupted in bursts), while BCH is better for single-bit errors, which NAND memories suffer from.
If we look at the page layout of NAND controllers, we see that the number of ECC bytes is 100+, which does not say much in itself; we must take the size of the data area into consideration too.

What I miss from Tomset's post is the native sector size of that HDD; it can be 512 or 4096 bytes, probably 512 though.


pepe

_________________
Adatmentés - Data recovery


 Post subject: Re: About ECC
PostPosted: June 29th, 2022, 13:53 

Joined: September 8th, 2009, 18:21
Posts: 15463
Location: Australia
The idea behind interleaving is that physical damage cannot wipe out an entire "ECC group".

This would be vulnerable to a burst error:

    Bit1 Group1, Bit2 Group1, Bit3 Group1, Bit4 Group1, ...

This would be invulnerable:

    Bit1 Group1, Bit1 Group2, Bit1 Group3, Bit1 Group4, ...
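
A minimal Python sketch of the two layouts above, assuming 4 hypothetical ECC groups of 8 bits each and a 3-cell burst (the sizes and function names are illustrative, not taken from any real controller). It counts how many bits each group loses to the same physical burst:

```python
from collections import Counter

def sequential_layout(n_groups, group_len):
    # Physical order: Bit1 Group1, Bit2 Group1, Bit3 Group1, ...
    return [(g, b) for g in range(n_groups) for b in range(group_len)]

def interleaved_layout(n_groups, group_len):
    # Physical order: Bit1 Group1, Bit1 Group2, Bit1 Group3, ...
    return [(g, b) for b in range(group_len) for g in range(n_groups)]

def hits_per_group(layout, burst_start, burst_len):
    # Count how many damaged cells land in each ECC group.
    damaged = layout[burst_start:burst_start + burst_len]
    return Counter(g for g, _ in damaged)

seq = hits_per_group(sequential_layout(4, 8), burst_start=5, burst_len=3)
ilv = hits_per_group(interleaved_layout(4, 8), burst_start=5, burst_len=3)
print("sequential :", dict(seq))
print("interleaved:", dict(ilv))
```

In the sequential layout the whole burst lands in one group; in the interleaved layout each group loses at most one bit as long as the burst is no longer than the number of groups. A longer burst would still hit some group twice.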

_________________
A backup a day keeps DR away.


 Post subject: Re: About ECC
PostPosted: June 29th, 2022, 13:54 

Joined: May 13th, 2019, 7:50
Posts: 907
Location: Nederland
pepe wrote:
NAND uses BCH (mostly), which are completely different algos.
pepe


Yes, but it still does not answer the question of whether there's a maximum to the number of bit errors that can be detected, right? Or am I missing it? The Intel guy in the video at the shared time-mark seems to suggest that if we'd rely solely on ECC, 3-bit errors would go undetected.
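
For the textbook "correct 1, detect 2" (SECDED) case, the limit can be shown with a toy code. The sketch below assumes an extended Hamming(8,4) code, chosen only because it is small; it is not the ECC that any particular NAND or HDD controller uses, and all names in it are made up for illustration. It tries every 1-, 2- and 3-bit error pattern on one code word:

```python
from itertools import combinations

# Positions 1, 2, 4 hold Hamming parity, position 0 holds overall parity.
DATA_POS = (3, 5, 6, 7)

def encode(nibble):
    code = [0] * 8
    for k, pos in enumerate(DATA_POS):
        code[pos] = (nibble >> k) & 1
    for p in (1, 2, 4):
        # Parity over every position whose index has bit p set.
        code[p] = sum(code[i] for i in range(1, 8) if i & p) & 1
    code[0] = sum(code) & 1  # makes the overall parity even
    return code

def decode(received):
    code = list(received)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i
    if sum(code) & 1:
        # Odd overall parity: assume ONE error and flip the indicated bit
        # (syndrome 0 means the overall parity bit itself).
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome:
        status = "uncorrectable"  # even parity, bad syndrome: two errors
    else:
        status = "ok"
    data = sum(code[pos] << k for k, pos in enumerate(DATA_POS))
    return status, data

def flip(code, positions):
    out = list(code)
    for p in positions:
        out[p] ^= 1
    return out

original = 0b1011
word = encode(original)
for n_errors in (1, 2, 3):
    results = {decode(flip(word, pos)) for pos in combinations(range(8), n_errors)}
    statuses = sorted({s for s, _ in results})
    wrong = any(d != original for s, d in results if s != "uncorrectable")
    print(n_errors, "bit errors:", statuses, "| wrong data accepted:", wrong)
```

For this toy code, every 3-bit pattern has odd overall parity, so the decoder assumes a single error, flips one more bit and reports the word as corrected while returning wrong data. The error is not flagged; it is silently miscorrected. Stronger codes push that limit further out, but they do not remove it.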

_________________
Joep - http://www.disktuna.com - video & photo repair & recovery service


 Post subject: Re: About ECC
PostPosted: June 29th, 2022, 17:00 

Joined: September 8th, 2009, 18:21
Posts: 15463
Location: Australia
Using the CRC-16 algorithm in HxD, I can create an artificially corrupt file out of a zero-filled file. The initial and final CRCs are both 0x0000, which means that the corruption has gone undetected.

Code:
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000  80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <-- bit # 7 flipped
........
00001000  40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <-- bit # 6 flipped
........
00002000  20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <-- bit # 5 flipped
........
00003000  10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <-- bit # 4 flipped
........
00004000  08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <-- bit # 3 flipped
........
00005000  04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <-- bit # 2 flipped
........
00006000  02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <-- bit # 1 flipped
........
00007000  01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <-- bit # 0 flipped
........
00007FF0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
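
The same experiment can be reproduced outside HxD. The sketch below assumes that HxD's "CRC-16" is the common CRC-16/ARC variant (polynomial 0x8005, bit-reflected, initial value 0); that is an assumption, although it is consistent with the pattern in the dump. It builds a zero-filled 32 KiB buffer and flips only the first two bits from the table:

```python
def crc16_arc(data: bytes) -> int:
    """Bitwise CRC-16/ARC: reflected polynomial 0xA001, init 0, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc

clean = bytes(0x8000)            # zero-filled file, same size as the dump

one_flip = bytearray(clean)
one_flip[0x0000] = 0x80          # bit 7 of the first byte

two_flips = bytearray(one_flip)
two_flips[0x1000] = 0x40         # bit 6 of the byte 0x1000 further on

print(hex(crc16_arc(clean)))
print(hex(crc16_arc(one_flip)))
print(hex(crc16_arc(two_flips)))
```

Under that assumption the two flipped bits are 32767 bit positions apart in the order the CRC consumes them (least significant bit of each byte first). That distance is the cycle length of this polynomial, so the two contributions cancel and the CRC returns to 0x0000. A single flipped bit always changes the CRC.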

_________________
A backup a day keeps DR away.


 Post subject: Re: About ECC
PostPosted: June 29th, 2022, 18:27 

Joined: October 3rd, 2005, 0:40
Posts: 4311
Location: Hungary
I can't watch it right now. However, given the code distance, it would take a very different code word to produce the very same syndromes, so with few errors I doubt it could happen. Franc's example shows that he needs to change 8 locations pretty precisely to produce the same 16-bit CRC over a 32k block, which is not comparable to BCH's 100+ bytes over 1024.
Franc, could you calculate the chances of such a thing happening accidentally? It is easy to violate a code if you know its properties, but the chance of it happening accidentally is extremely low. And it is only a 16-bit code over 256 kbits.

pepe

_________________
Adatmentés - Data recovery


 Post subject: Re: About ECC
PostPosted: June 29th, 2022, 18:40 

Joined: September 8th, 2009, 18:21
Posts: 15463
Location: Australia
I only need to flip 2 bits to produce the same CRC. That means there is one chance in 64K that the second flipped bit occurs in the "right" place.

I changed 8 bits just to demonstrate the pattern. In fact the pattern looks like interleaved 0x1000-byte blocks. To compensate for a flipped bit in one block, find the same byte in the next block and flip the next bit.

_________________
A backup a day keeps DR away.


 Post subject: Re: About ECC
PostPosted: July 21st, 2022, 7:49 

Joined: August 13th, 2016, 17:10
Posts: 193
Location: Vienna, Austria
A little bit of theory: With a CRC algorithm, changing a single bit of the input always changes the output. This helps 100% against single bit flips, but as soon as you have several bit flips, there is a chance that they cancel out and the errors go unnoticed. With TCP/IP, for example, you only have a 16-bit checksum per packet (strictly speaking a ones' complement sum rather than a CRC). One practical example I experienced in the past is that it's impossible to transfer a CD image (650 MB of data) over SSH/SCP over a noisy network cable, since you will have too many errors that punch through the TCP/IP checksum.
So CRCs are a very weak protection (but they are also very fast compared to the other methods).

With cryptographic hash functions, when you change a single bit of the input, on average half of the output bits change. This is the strongest way to detect any number of bit changes. For a good hash function like e.g. SHA2-384 you have a chance of 1:2^384 of a collision, which is so rare that you will most likely never experience one in your lifetime.
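
The "half of the output bits" behaviour is easy to check with Python's standard hashlib. This sketch flips one input bit of an arbitrary, made-up 1 KiB message and counts how many of the 384 digest bits differ:

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    # Number of bit positions in which two equal-length byte strings differ.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = bytearray(b"\x00" * 1024)      # arbitrary example input
digest_before = hashlib.sha384(bytes(message)).digest()

message[100] ^= 0x01                     # flip a single input bit
digest_after = hashlib.sha384(bytes(message)).digest()

changed = bit_difference(digest_before, digest_after)
print(f"{changed} of {len(digest_before) * 8} digest bits changed")
```

The count should land somewhere around 192, whichever input bit is flipped.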

CRCs/hashes add between 8 and 512 bits of overhead to detect bit errors.

Now CRCs and hashes both only detect errors; they cannot fix them. ECC or FEC (Error Correction Codes / Forward Error Correction) add more redundant data, so that you can not only detect bit errors but also fix them. And there are flexible algorithms which you can tune, e.g. taking 1024 bits of input, being able to correct 10 bit errors and to detect 16 bit errors. The idea behind that capability is that you can fix a few errors, but also detect when there are more than just a few, so you can report that the whole sector is bad and beyond automatic repair, and either request a re-transmit or declare the sector bad.
As a general rule of thumb, ECC can add 30% of overhead, but it highly depends on the tuning: the more bit errors you want to be able to repair and detect, the more overhead you need.
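
As a rough illustration of how the overhead scales, a binary BCH code over GF(2^m) needs at most m parity bits per correctable bit error. The helper below (a hypothetical name, using that textbook upper bound rather than any vendor's actual layout) estimates the parity size for the 1024-bit, 10-error example:

```python
def bch_parity_bits(data_bits, t):
    """Return (m, parity_bits) using the bound parity <= m * t."""
    m = 1
    # The code word (data + parity) must fit in the natural length 2**m - 1.
    while data_bits + m * t > 2**m - 1:
        m += 1
    return m, m * t

m, parity = bch_parity_bits(1024, 10)
print(f"m = {m}, parity <= {parity} bits, overhead <= {parity / 1024:.1%}")
```

That comes to roughly 11% for this example, counting only the correction capability. Asking for more correctable bits, or for extra detection margin on top, raises it.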

