July 24th, 2013, 5:30
July 24th, 2013, 7:57
fzabkar wrote:Keatah wrote:Saving $0.05 USD on a NAND chip can mean millions in extra profit.
What percentage of SSD faults are the result of NAND flash failures?
AFAICT from various storage forums, including SalvationData's, the most common failure mode appears to involve corruption of the Flash Translation Layer. In such cases there is no actual failure in any physical component. Can anyone enlighten me if I'm wrong?
this was 2 years after the initial release. Does it mean that for 2 years, this cap was not known to be required, so possibly causing issues with devices? Maybe.
In Table 1 on page 10, added “Connect this pin with a 4.7uF capacitor to ground.” to VREG “External Capacitor Pin”.
was this caught before or after a few thousand got in the wild?
Re-order pin location due to floor plane issue
To avoid crystal power noise affecting PHY, separating the power domain by pin 16 and 17
SM3257ENAAISP fix MAC compatibility issue
- a very popular and widespread controller.
3257EN ISP fix NTFS format will mis-compare fail issue.
SM3257ENAAISP fixed the system block have ECC fail and initial fail.
SM3257ENAAISP fixed 5V normal power cycling FAT block serial number error issue
SM3257ENAAISP fixed double mark bad block issue
July 24th, 2013, 9:13
fzabkar wrote:AFAICT from various storage forums, including SalvationData's, the most common failure mode appears to involve corruption of the Flash Translation Layer. In such cases there is no actual failure in any physical component. Can anyone enlighten me if I'm wrong?
July 24th, 2013, 10:44
Reliable NAND management is increasingly complicated
Bit error probabilities increase with shrinking geometries and MLC
Bit “disturbs” are inherent to the NAND architecture
Cells not being programmed received elevated voltage stress.
Charge collects on the floating gate causing cell to appear weakly programmed
This paper examines the complex flash
errors that occur at 30-40nm flash technologies. We demonstrate
distinct error patterns, such as cycle-dependency, location dependency
and value-dependency, for various types of flash
operations. We analyse the discovered error patterns and explain
why they exist from a circuit and device standpoint.
Bit error rate in NAND Flash memories
Published in: Reliability Physics Symposium, 2008. IRPS 2008. IEEE International
Date of Conference: April 27 2008-May 1 2008
..NAND flash memories have bit errors that are corrected by error-correction codes (ECC). We present raw error data from multi-level-cell devices from four manufacturers, identify the root-cause mechanisms, and estimate the resulting uncorrectable bit error rates (UBER). Write, retention, and read-disturb errors all contribute...
July 24th, 2013, 13:32
July 24th, 2013, 14:36
Keatah wrote:Seems mfgs are going to greater and greater lengths to have the controller cover up internal errors, which are occurring at smaller and smaller geometries. When is the point of no return reached?
July 24th, 2013, 18:18
Doomer wrote:fzabkar wrote:AFAICT from various storage forums, including SalvationData's, the most common failure mode appears to involve corruption of the Flash Translation Layer. In such cases there is no actual failure in any physical component. Can anyone enlighten me if I'm wrong?
And what would be the cause of "Flash Translation Layer" corruption? Translation tables live on the very same NAND chips
The issue usually happens when you have writing problems of any kind
July 24th, 2013, 18:50
HaQue wrote:There are a few resources that mention some reasons why bits fail, so ISTM that where there are bits failing, there can also be bytes failing. It seems "it is just the way it is"
July 24th, 2013, 20:51
fzabkar wrote:I suspect that much of this corruption occurs as a consequence of unexpected power loss.
July 25th, 2013, 1:04
July 25th, 2013, 18:28
Doomer wrote:fzabkar wrote:(128 gigabytes x 3000 cycles) / (10 years) = 105 gigabytes per day
FYI 105GB/day is not much
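For anyone who wants to check the quoted arithmetic, here is a minimal Python sketch. The 128 GB capacity, 3000 cycles and 10 years come from the quoted post; the function name and the 365-day year are assumptions for illustration, and write amplification is ignored entirely.

```python
def daily_write_budget_gb(capacity_gb, pe_cycles, years, days_per_year=365):
    """Average writes per day (in GB) that would consume the rated cycles.

    Assumes perfect wear levelling and a write amplification of 1,
    which real drives will not achieve.
    """
    total_writes_gb = capacity_gb * pe_cycles
    return total_writes_gb / (years * days_per_year)


budget = daily_write_budget_gb(128, 3000, 10)
print(f"{budget:.1f} GB/day")
```

With those inputs the result rounds to the 105 GB/day figure in the quote.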
July 25th, 2013, 18:36
July 25th, 2013, 19:33
Doomer wrote:When you read from SSD often it causes charges in NAND cells to dissipate, that means if you only read data you still need to re-copy it after 5-10 reads from the same NAND cell to another NAND cell
July 25th, 2013, 19:52
fzabkar wrote:This statement sounds absurd. In fact I find absolutely no reference to any such constraints in any NAND flash datasheets.
July 25th, 2013, 21:27
Reliability Guidance
This reliability guidance is intended to notify some guidance related to using NAND flash with
1 bit ECC for each 512 bytes. For detailed reliability data, please refer to TOSHIBA’s reliability note.
Although random bit errors may occur during use, it does not necessarily mean that a block is bad.
Generally, a block should be marked as bad when a program status failure or erase status failure is detected.
The other failure modes may be recovered by a block erase.
ECC treatment for read data is mandatory due to the following Data Retention and Read Disturb failures.
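To make "1 bit ECC for each 512 bytes" a little more concrete, here is a rough Python sketch of single-bit correction over one 512-byte sector. It uses a simple Hamming-style position syndrome chosen for illustration; it is not claimed to be the scheme any particular controller or this datasheet uses, and the function names are hypothetical. It also assumes the stored check value itself is undamaged.

```python
SECTOR_BYTES = 512


def syndrome(data: bytes) -> int:
    """XOR together the 1-based positions of every set bit in the sector."""
    s = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if (byte >> bit) & 1:
                s ^= byte_index * 8 + bit + 1
    return s


def correct_single_bit(data: bytes, stored_syndrome: int) -> bytes:
    """Return the sector with at most one flipped bit repaired.

    A single flipped bit at position p changes the syndrome by exactly p,
    so the XOR of stored and recomputed syndromes points at the bad bit.
    """
    diff = syndrome(data) ^ stored_syndrome
    if diff == 0:
        return data  # no error detected
    if diff > len(data) * 8:
        raise ValueError("more than one bit in error; cannot correct")
    pos = diff - 1
    fixed = bytearray(data)
    fixed[pos // 8] ^= 1 << (pos % 8)
    return bytes(fixed)
```

The point of the sketch is only that one flipped bit per sector is recoverable, while two or more generally are not, which is why the guidance treats ECC as mandatory rather than optional.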
Write/Erase Endurance
--------------------------
Write/Erase endurance failures may occur in a cell, page, or block, and are detected by doing a status read
after either an auto program or auto block erase operation. The cumulative bad block count will increase
along with the number of write/erase cycles.
Data Retention
-----------------
The data in memory may change after a certain amount of storage time. This is due to charge loss or charge
gain. After block erasure and reprogramming, the block may become usable again.
Read Disturb
---------------
A read operation may disturb the data in memory. The data may change due to charge gain. Usually, bit
errors occur on other pages in the block, not the page being read. After a large number of read cycles
(between block erases), a tiny charge may build up and can cause a cell to be soft programmed to another
state. After block erasure and reprogramming, the block may become usable again.
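The policy described in the guidance above (mark a block bad only on a program or erase status failure, and treat read disturb as something a block erase can recover from) could be sketched roughly like this in Python. The class, the function names and especially the threshold value are assumptions for illustration; the datasheet excerpt gives no number, and real limits are device-specific.

```python
from dataclasses import dataclass

# Hypothetical value; the quoted guidance only says "a large number of read cycles".
READ_REFRESH_THRESHOLD = 100_000


@dataclass
class Block:
    reads_since_erase: int = 0
    bad: bool = False


def on_status(block: Block, operation: str, status_ok: bool) -> None:
    """Apply the result of a status read after a program or erase."""
    if operation in ("program", "erase") and not status_ok:
        block.bad = True  # status failure: retire the block
    elif operation == "erase":
        block.reads_since_erase = 0  # a good erase clears accumulated disturb


def on_read(block: Block, threshold: int = READ_REFRESH_THRESHOLD) -> bool:
    """Count a read; return True once the block should be copied and erased."""
    block.reads_since_erase += 1
    return block.reads_since_erase >= threshold
```

A controller following this idea would copy the data elsewhere and erase the block when on_read returns True, rather than marking the block bad.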
July 25th, 2013, 23:43
fzabkar wrote:What about flash "EEPROMs" such as those used in my 10-year-old PC?
July 25th, 2013, 23:54
July 26th, 2013, 4:06
July 26th, 2013, 4:14
Vulcan wrote:I believe that Doomer is referring to the phenomenon of "read disturb". Research that and you'll find it really does exist and yes, it's not in the typical datasheets - but it's real, as various technical papers on issues with NAND flash technology will confirm.
July 26th, 2013, 9:21
fzabkar wrote:Doomer wrote:When you read from SSD often it causes charges in NAND cells to dissipate, that means if you only read data you still need to re-copy it after 5-10 reads from the same NAND cell to another NAND cell
This statement sounds absurd.
fzabkar wrote:Are you suggesting that those millions of devices that are built around SoCs and embedded flash memory need to be "refreshed" after every 5-10 power-on events? What about flash "EEPROMs" such as those used in my 10-year-old PC?