All times are UTC - 5 hours [ DST ]




 Post subject: Samsung 870 Evo -- is it a dud model?
PostPosted: November 19th, 2022, 14:15 

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
I have seen numerous threads involving premature failures in Samsung's 870 Evo SSD. The SMART reports would suggest that these SSDs are populated with flaky NAND (128L TLC). Are there any other recent models that are similarly affected?

https://www.techpowerup.com/forums/threads/samsung-870-evo-beware-certain-batches-prone-to-failure.291504/
https://goughlui.com/2022/08/20/notes-ssdraid-recovery-samsung-870-evo-not-to-be-trusted/
https://www.reddit.com/r/techsupport/comments/og701s/should_i_rma_my_samsung_870_evo_2tb/

3RAD1CAT0R wrote:
I bought 14x 1TB 870 EVOs mid November 2021. 7 of them failed on the same day about a week ago with varying amounts of uncorrectable sectors. All about 850GB written, 90 days of power on time at time of failure. 3 in one 4 drive raid 0, and all 4 in another 4 drive raid 0. The remaining 1 survived a 2tbw stress test, so I'm gonna assume it's fine. Moved all content off the other 6 I have just in case, still need to stress those a bit.

_________________
A backup a day keeps DR away.


 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: November 20th, 2022, 0:25

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
https://www.reddit.com/r/unRAID/comments/r4rcqg/warning_multiple_870_evo_failures_often_many_ssds/

ThomasGlanzmann wrote:
I have the very same issue. Devices 3-6 months old failing. I have 8 870 EVOs and 5 960 PRO NVMe that failed on me. Or to put it in other words out of the 15 Samsung SSDs that I bought this year, 13 failed on me. Just this morning another 870 EVO failed on me with 14 TB written.

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: November 20th, 2022, 17:39

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
I have tried to identify the NAND part numbers. The 4TB example uses a 1TB NAND with a different part number. I don't know if this represents a BOM change.

K9DVGY8J5BDCK0 = 1TB NAND
K9DYGB8J1BDCK0 = 1TB NAND

https://www.relaxedtech.com/reviews/samsung/870-evo/ (K9DVGY8J5BDCK0, 1TB NAND)
https://www.cdrlabs.com/reviews/samsung-870-evo-1tb-solid-state-drive/all-pages.html (K9DVGY8J5BDCK0, 1TB NAND)
https://www.tweaktown.com/reviews/9710/samsung-870-evo-4tb-sata-ssd/index.html (K9DYGB8J1BDCK0, 1TB V-NAND 3-bit MLC (TLC))
https://forum.shiftdelete.net/konular/samsung-870-evo-inceleme.757429/ (K9DVGY8J5BDCK0, 1TB NAND)
https://forum.hddguru.com/viewtopic.php?p=300460 (870 Evo 500GB, K90UGY8J5BCCK0, 512GB NAND)

This 980 Pro NVMe SSD uses the same 1TB NAND (K9DVGY8J5BDCK0 sixth-generation stacked V-NAND flash memory, with a single capacity of 1TB):

https://zhuanlan.zhihu.com/p/468526275 (980 Pro m.2)

This thread lists several NAND failures in the PM9A1 model (= MZVL22T0HBLB = 980 Pro):

https://www.chiphell.com/thread-2435363-1-1.html

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: November 20th, 2022, 17:43

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
Upgrading from SVT01B6Q firmware to SVT02B6Q adds an extra vendor-specific SMART attribute (ID = 0xFC, i.e. 252 decimal) for the 870 Evo.

https://www.techpowerup.com/forums/attachments/1649706326413-png.243253/
https://www.techpowerup.com/forums/threads/samsung-870-evo-beware-certain-batches-prone-to-failure.291504/page-3

https://en.wikipedia.org/wiki/Self-Monitoring,_Analysis_and_Reporting_Technology

Quote:
252 / 0xFC - Newly Added Bad Flash Block

The Newly Added Bad Flash Block attribute indicates the total number of bad flash blocks the drive detected since it was first initialized in manufacturing.
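
For anyone who wants to watch this attribute on their own drive, here is a minimal Python sketch. It assumes the attribute table has been dumped with smartmontools' JSON output (`smartctl -j -A`) and that the table is laid out as in the sample below. The sample values and the helper name `find_attribute` are made up for illustration and are not taken from a real 870 Evo.

```python
import json


def find_attribute(smartctl_json: str, attr_id: int):
    """Return the raw value of an ATA SMART attribute, or None if it is absent."""
    report = json.loads(smartctl_json)
    table = report.get("ata_smart_attributes", {}).get("table", [])
    for entry in table:
        if entry.get("id") == attr_id:
            return entry.get("raw", {}).get("value")
    return None


# Hypothetical sample, shaped like smartctl's JSON attribute table.
SAMPLE = json.dumps({
    "ata_smart_attributes": {
        "table": [
            {"id": 9, "name": "Power_On_Hours", "raw": {"value": 2160}},
            {"id": 252, "name": "Unknown_Attribute", "raw": {"value": 17}},
        ]
    }
})

NEW_BAD_BLOCKS_ID = 0xFC  # 252 decimal, per the quote above

bad_blocks = find_attribute(SAMPLE, NEW_BAD_BLOCKS_ID)
print(f"Attribute {NEW_BAD_BLOCKS_ID}: raw value {bad_blocks}")
```

On older firmware the attribute is simply missing, so the helper returns None rather than raising.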

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: November 20th, 2022, 19:34

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
More bad stories ...

https://www.computerbase.de/forum/threads/2-x-samsung-evo-870-4-tb-nach-knapp-einem-jahr-mit-defekten-sektoren.2063499/page-2

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: November 23rd, 2022, 16:21

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
These models seem to be featuring in the forums:

    980 Pro
    PM9A1
    970 Evo (?) & Evo Plus

https://www.reddit.com/r/DataHoarder/comments/x8arle/psa_samsung_980_pro_users_in_china_are_observing/
https://web.archive.org/web/20221009111526/www.cnbeta.com/articles/tech/1313507.htm

Quote:
Many users in China complain about loss of data due to bad blocks in Samsung 980 Pro. It's now reported by mainstream media in China. This usually happens when the SSD has been used for 6-12 months. Samsung in China allegedly issued a statement but it was quickly pulled off.

Problems are showing up in the following SMART attributes:

    Available Spare (Percent)
    Media and Data Integrity Errors
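
A rough Python sketch of how one might flag these two attributes automatically. The key names (`available_spare`, `available_spare_threshold`, `media_errors`) are what I understand smartmontools to emit for NVMe drives in its JSON output, so treat them as an assumption. The sample logs are invented, and treating any non-zero error count as a warning is my own rule of thumb, not a vendor threshold.

```python
def nvme_health_warnings(log: dict) -> list:
    """Return human-readable warnings for an NVMe SMART/health log dict."""
    warnings = []

    spare = log.get("available_spare")
    threshold = log.get("available_spare_threshold")
    if spare is not None and threshold is not None and spare < threshold:
        warnings.append(
            f"Available Spare {spare}% is below threshold {threshold}%"
        )

    media_errors = log.get("media_errors", 0)
    if media_errors > 0:
        warnings.append(f"Media and Data Integrity Errors = {media_errors}")

    return warnings


# Hypothetical logs for illustration only.
healthy = {"available_spare": 100, "available_spare_threshold": 10, "media_errors": 0}
failing = {"available_spare": 8, "available_spare_threshold": 10, "media_errors": 212}

for name, log in (("healthy", healthy), ("failing", failing)):
    print(name, nvme_health_warnings(log))
```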

This thread reports that Samsung has changed the NAND and controller in the newest version of their 970 Evo Plus:

https://www.techpowerup.com/286008/et-tu-samsung-samsung-too-changes-components-for-their-970-evo-plus-ssd

The new NAND has a part number similar to the NAND used in the 870 Evo, namely K9DUGY8J5B-CCK0.

The older version of the 970 Evo Plus used K9DUGY8J5B-DCK0 NAND.

https://www.computerbase.de/2021-08/970-evo-plus-auch-samsung-tauscht-ssd-komponenten-aus/

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: November 25th, 2022, 15:35

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
Many more reports of failures in 970 Evo:

https://www.reddit.com/r/buildapc/comments/x82mwe/samsung_ssd_smart_0e_issue/

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: November 30th, 2022, 12:16

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
This 2TB 980 Pro was used in Chia mining and wrote 3.54PB before "failing":

https://www.reddit.com/r/chia/comments/somv5o/samsung_2tb_980_pro_is_finally_reporting_failed/

    Media and Data Integrity Errors = 0
    Available Spare = 100%
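
To put 3.54PB into perspective, a quick back-of-the-envelope calculation in Python, assuming decimal units (1PB = 10^15 bytes, 1TB = 10^12 bytes) and ignoring write amplification:

```python
TB = 10**12  # decimal terabyte, an assumption about how the figures were reported

bytes_written = 3540 * TB   # 3.54 PB written by the host
capacity = 2 * TB           # 2TB 980 Pro

# Number of times the whole drive capacity was overwritten.
full_drive_writes = bytes_written / capacity
print(f"Full drive writes: {full_drive_writes:.0f}")
```

That is well over a thousand complete overwrites of the drive with no media errors logged, in contrast to the failure reports above, where drives had written only a few TB or less.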

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: November 30th, 2022, 12:22

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
Here is a thread where users are reporting uncharacteristically high failure rates in recent versions of Crucial's MX500:

https://www.reddit.com/r/sysadmin/comments/whr5ek/crucial_mx500_historically_good_recent_batches/

Here is a good source of info on the evolution of the MX500:

https://theoverclockingpage.com/2022/07/27/review-crucial-mx500-1tb-um-dos-melhores-ssds-satas-do-aliexpress-com-dram-cache/

You can see how Crucial has changed the NAND and controller over the years:

https://theoverclockingpage.files.wordpress.com/2022/07/variantes.png

https://theoverclockingpage.files.wordpress.com/2022/07/especificacoes.jpg

The author claims that there is even a QLC version in the 2TB and 4TB capacities.

The most recent NANDs are "176-layer Micron FortisFlash B47R Replacement Gate Charge Trap NAND". The marking codes are NY133 and NY135.

    NY135 = MT29F8T08EWLEEM5-QA:E (8 Tbit)

    NY133 = MT29F2T08EMLEEJ4-QA:E (2 Tbit)

Older MX500 versions had Micron NW925 and NW926 NANDs.

    NW925 = MT29F512G08EECAGJ4-5M:A (512 Gbit)

    NW926 = MT29F1T08EMCAGJ4-5M:A (1 Tbit)
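
The density field in these part numbers can be pulled out mechanically. This Python sketch is based only on the four part numbers listed above and is not a full Micron decoder. It assumes the digits and letter immediately after "MT29F" give the density, and that Gbit/Tbit are binary multiples. The function names are my own.

```python
import re

# Assumption: "G" and "T" denote binary gigabits and terabits.
UNIT_BITS = {"G": 2**30, "T": 2**40}


def micron_density_bits(part_number: str) -> int:
    """Extract the NAND density in bits from an MT29F... part number."""
    match = re.match(r"MT29F(\d+)([GT])", part_number)
    if match is None:
        raise ValueError(f"Unrecognised part number: {part_number}")
    return int(match.group(1)) * UNIT_BITS[match.group(2)]


def bits_to_gib(bits: int) -> float:
    """Convert a bit count to GiB (8 bits per byte, 2**30 bytes per GiB)."""
    return bits / 8 / 2**30


for part in (
    "MT29F8T08EWLEEM5-QA:E",
    "MT29F2T08EMLEEJ4-QA:E",
    "MT29F512G08EECAGJ4-5M:A",
    "MT29F1T08EMCAGJ4-5M:A",
):
    print(part, bits_to_gib(micron_density_bits(part)), "GiB")
```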

Micron's FBGA and Component Marking Decoder:

https://www.micron.com/support/tools-and-utilities/fbga

FlashMaster's NAND flash part number and ID decoder:

https://nand.gq/#/decode

The latest versions have a Silicon Motion SM2259 controller. Earlier versions had an SM2258.

I have posted info, including hi-res PCB photos, of my 1TB MX500 here (fw M3CR043):

https://forums.tomshardware.com/threads/crucial-mx500-500gb-sata-ssd-remaining-life-decreasing-fast-despite-few-bytes-being-written.3571220/post-22866935

I purchased an MX500 1TB SSD earlier this year. I haven't used it much, but now I'll be paying special attention to it. :-(

Here are scans of the PCBs:

http://users.on.net/~fzabkar/SSD/Micron/MX500/

The NAND flash is Micron MT29F2T08EMLEEJ4-QA:E (part marking NY133).

The flash controller is a Silicon Motion SM2259H-AC with a YYWW (Year/Week) date code of 2137 (week 37 of 2021).
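
For reference, a small Python sketch that decodes a YYWW date code like the one above. It assumes the year is in the 2000s and that the week follows ISO week numbering, neither of which is confirmed by the chip maker. `parse_yyww` is just an illustrative name.

```python
import datetime


def parse_yyww(code: str):
    """Split a 4-digit YYWW date code into (year, week)."""
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"Expected 4 digits, got {code!r}")
    year = 2000 + int(code[:2])   # assumption: 21st century
    week = int(code[2:])
    return year, week


year, week = parse_yyww("2137")

# Assumption: ISO week numbering. Day 1 is the Monday of that week.
week_start = datetime.date.fromisocalendar(year, week, 1)
print(f"Week {week} of {year}, starting {week_start.isoformat()}")
```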

The SDRAM is Micron MT41K256M16TW-107:P (part marking D9SHD).

https://www.micron.com/products/dram/ddr3-sdram/part-catalog/mt41k256m16tw-107
https://media-www.micron.com/-/media/client/global/documents/products/data-sheet/dram/ddr3/4gb_ddr3l.pdf?rev=c2e67409c8e145f7906967608a95069f

Firmware is M3CR043.

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: January 31st, 2023, 19:19

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
I wonder what this "fix" entails:

Samsung Issues Fix for Dying 980 Pro SSDs:
https://www.tomshardware.com/news/samsung-980-pro-ssd-failures-firmware-update

How does one "fix" bad NAND (?) with a firmware update?

The 990 Pro appears to have its own problems:

https://www.guru3d.com/news-story/samsung-990-pro-users-report-a-rapidly-declining-lifespan-for-the-ssd.html

https://www.tomshardware.com/news/samsung-990-pro-health-dropping-fast

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: February 1st, 2023, 2:16

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
Samsung's web site has a firmware update for the 870 EVO.

There is this advisory note:

Quote:
*The 870 EVO model will be manufactured with a revised V6 process starting November 2022.

I wonder what that means. New NAND?

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: September 6th, 2023, 14:31

Joined: October 3rd, 2005, 0:40
Posts: 4311
Location: Hungary
I was looking for a 970 Pro the other day... which is (was) 2-bit MLC. All TLC is likely to fail within a short time, 1-3 years max, I think.
I am using an old OCZ Revodrive 350 which has a whole lot of NAND chips, TH58TEG7DDJBA4C, probably MLC, still in good shape after some 6-7 years of 24/7 usage. So I wanted an SSD with MLC chips and was unable to find any. :s

I am disappointed: you cannot buy an MLC SSD even if you are prepared to pay a higher price...
crap

pepe

_________________
Adatmentés - Data recovery


 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: September 6th, 2023, 22:52

Joined: July 8th, 2019, 12:27
Posts: 146
Location: 中国大陆浙江省湖州市
If anyone is using 870 EVO, 970 EVO Plus, 980, 980 PRO, or PM981A SSDs, please update the firmware as soon as possible. These drives have been associated with premature degradation, and if you notice abnormal values in the SMART (0E) attributes of these drives, please back up your data promptly. Samsung has released corrected firmware updates, but I believe the primary issue stems from premature degradation of flash memory chip cells.

The corrected firmware may work in a similar way to the fix for the failures seen in the 840 EVO era. To avoid a large-scale recall, Samsung has added in the firmware logic: after a certain time, the controller rewrites cold data to new locations to mitigate data loss caused by cell degradation. However, in this mode, the actual write rate to the flash memory increases, reducing the flash memory's durability.
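
As a rough illustration of why periodic rewriting costs endurance, here is a small Python estimate. The refresh interval and the amount of cold data are purely hypothetical numbers. Samsung has not published what its firmware actually does, and the function name is my own.

```python
def extra_writes_per_year_gb(cold_data_gb: float, refresh_interval_days: float) -> float:
    """Estimate the NAND writes per year caused by rewriting cold data.

    Assumes every byte of cold data is rewritten once per interval.
    """
    if refresh_interval_days <= 0:
        raise ValueError("refresh interval must be positive")
    return cold_data_gb * 365.0 / refresh_interval_days


# Hypothetical: 800 GB of rarely-changed data, refreshed every 90 days.
extra = extra_writes_per_year_gb(800, 90)
print(f"Extra NAND writes: about {extra:.0f} GB per year")
```

These writes come on top of whatever the host writes, so the shorter the interval, the faster the rated endurance is consumed even on a drive that is mostly read.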

I recently conducted in-depth research on these chips in my work, and there will be a better understanding of these chips in the near future.

_________________
Auxiliary Tool Used For MonoLith Data Recovery, featuring the industry's most extensive Monolith pinouts
http://flash-matrix.com/


 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: September 7th, 2023, 3:05

Joined: November 7th, 2020, 5:31
Posts: 1084
Location: The_UK
csava wrote:
To avoid a large-scale recall, Samsung has added in the firmware logic: after a certain time, the controller rewrites cold data to new locations to mitigate data loss caused by cell degradation.
That's very interesting - I wonder what the threshold for data being considered cold or at risk is set to, as that really could chew through a drive's health very quickly.

csava wrote:
I recently conducted in-depth research on these chips in my work, and there will be a better understanding of these chips in the near future.
Always interested in whatever you share - even when it requires research to understand it :lol:

_________________
Data Recovery Services in the UK.
https://www.usbrecovery.co.uk/


 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: September 7th, 2023, 5:05

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
csava wrote:
The corrected firmware may work in a similar way to the fix for the failures seen in the 840 EVO era. To avoid a large-scale recall, Samsung has added in the firmware logic: after a certain time, the controller rewrites cold data to new locations to mitigate data loss caused by cell degradation. However, in this mode, the actual write rate to the flash memory increases, reducing the flash memory's durability.

I suspected as much. Thanks very much for the confirmation.

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: September 7th, 2023, 5:38

Joined: July 8th, 2019, 12:27
Posts: 146
Location: 中国大陆浙江省湖州市
Lardman wrote:
csava wrote:
To avoid a large-scale recall, Samsung has added in the firmware logic: after a certain time, the controller rewrites cold data to new locations to mitigate data loss caused by cell degradation.
That's very interesting - I wonder what the threshold for data being considered cold or at risk is set to, as that really could chew through a drive's health very quickly.

csava wrote:
I recently conducted in-depth research on these chips in my work, and there will be a better understanding of these chips in the near future.
Always interested in whatever you share - even when it requires research to understand it :lol:

The SSD's controller and firmware logic are essentially a black box to us, and I don't have the capability to decompile the firmware to fully understand it. However, there are still ways to indirectly verify what operations the controller is currently performing. Even with the same controller and the same NAND flash, drives can behave entirely differently if the firmware differs.

I have read some literature written by controller design engineers and research papers from universities. They mention that part of the firmware is responsible for recording the write counts of flash memory cells and the timestamp of the last write. Using these markers, the controller can implement more intelligent wear-leveling and manage hot and cold data.

I have indeed conducted related tests and found that when the controller reads units with higher error rates, the current drawn by the SSD suddenly increases at that point. A logic analyzer shows a long RB# (ready/busy) time, and if the firmware supports hot and cold data management, it performs write operations after the read is complete. Meanwhile, Windows Task Manager shows the SSD at 100% utilization, and the terminal frequently transmits some incomprehensible data.
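
To make the idea concrete, here is a toy Python model of timestamp-based cold data refresh. It is only a sketch of the general technique described in the literature, not Samsung's firmware. The `Block` fields, the age threshold, and the simplification that a refresh costs exactly one extra program/erase cycle are all my own assumptions.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    last_write_time: float  # seconds, on any monotonically increasing clock
    erase_count: int        # wear counter kept by the firmware


def select_cold_blocks(blocks, now, max_age_seconds):
    """Return the blocks whose data has not been rewritten for too long."""
    return [b for b in blocks if now - b.last_write_time > max_age_seconds]


def refresh_cold_blocks(blocks, now, max_age_seconds):
    """Simulate rewriting cold data; returns the number of blocks refreshed.

    Simplification: the move to a new location is modelled as a fresh
    timestamp plus one extra program/erase cycle on the same record.
    """
    cold = select_cold_blocks(blocks, now, max_age_seconds)
    for block in cold:
        block.last_write_time = now
        block.erase_count += 1
    return len(cold)


blocks = [Block(0, last_write_time=0.0, erase_count=5),
          Block(1, last_write_time=900.0, erase_count=5)]
refreshed = refresh_cold_blocks(blocks, now=1000.0, max_age_seconds=500.0)
print(f"Refreshed {refreshed} block(s):", blocks)
```

Only the stale block is rewritten, and its wear counter goes up even though the host wrote nothing, which is the trade-off discussed above.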

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: September 7th, 2023, 21:53

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
The Endurance Group Log (Log Identifier 09h) defines a "Media Units Written" parameter. If supported by the drive, it could tell us the Write Amplification.
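
The calculation itself would be trivial. A minimal Python sketch, assuming the drive reports both "Media Units Written" and the host-side "Data Units Written" in the same log and in the same units (the sample counter values are invented):

```python
def write_amplification(media_units_written: int, data_units_written: int) -> float:
    """Ratio of data written to the NAND to data written by the host."""
    if data_units_written <= 0:
        raise ValueError("host data units written must be positive")
    return media_units_written / data_units_written


# Hypothetical counters read from the log, both in the same units.
waf = write_amplification(media_units_written=3000, data_units_written=2000)
print(f"Write amplification factor: {waf:.2f}")
```

Sampling the two counters before and after a known workload, and dividing the differences, would show the amplification for that period rather than over the drive's whole life.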

https://nvmexpress.org/wp-content/uploads/NVM-Express-1_4-2019.06.10-Ratified.pdf

The image is OK after you click on it. :-?

Attachment: Endurance_Group_Log_09h.gif

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: September 8th, 2023, 3:09

Joined: July 8th, 2019, 12:27
Posts: 146
Location: 中国大陆浙江省湖州市
fzabkar wrote:
The Endurance Group Log (Log Identifier 09h) defines a "Media Units Written" parameter. If supported by the drive, it could tell us the Write Amplification.

I know about this 'Smart' option, but I believe that these values can be unreliable in certain situations. Any option in the Smart features can be configured by the controller manufacturer and the production company to appear however they need it to be shown to you. For example, there have been consumer reports stating that certain SSD models using a particular controller tend to overheat, leading to performance throttling and premature flash chip degradation. This information has spread across the internet, causing the manufacturers to urgently address the issue to counter the declining reputation of their product.

How can this situation be resolved? If it's a hardware design flaw that can't be perfectly fixed through a firmware update, an honest company like Intel might choose to recall the problematic products. However, other brands with a history of poor practices, such as Samsung and Western Digital, might only release firmware updates. These updates could adjust the offset of the temperature sensor (actual temperature 80 degrees Celsius, displayed as 65 degrees Celsius, offset -15 degrees Celsius), reduce the frequency at which the controller core and flash memory operate, and modify the duration of full-speed mode to address the overheating issue. This is a common practice in the industry, and even many reputable big brands do the same.

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: September 8th, 2023, 15:10

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
csava wrote:
However, other brands with a history of poor practices, such as Samsung and Western Digital, might only release firmware updates. These updates could adjust the offset of the temperature sensor (actual temperature 80 degrees Celsius, displayed as 65 degrees Celsius, offset -15 degrees Celsius)...

Are you referring to the "composite" temperature? IIUC, this is a value that is derived by computing a weighted average of all the temperature sensors on the drive. As such, it would be very subjective. For example, a manufacturer could lower the reported composite temperature by applying a lower weighting to the flash core without actually falsifying the core temperature reading.
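
Here is a small Python sketch of what I mean, assuming the composite really is a weighted average as I described. The sensor names, readings and weights are all hypothetical. I don't know how any particular vendor computes it.

```python
def composite_temperature(readings: dict, weights: dict) -> float:
    """Weighted average of sensor readings (degrees Celsius)."""
    total_weight = sum(weights[name] for name in readings)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(readings[name] * weights[name] for name in readings)
    return weighted_sum / total_weight


# Hypothetical sensor readings: neither value is altered below.
readings = {"controller": 60.0, "nand": 80.0}

equal = composite_temperature(readings, {"controller": 1.0, "nand": 1.0})
skewed = composite_temperature(readings, {"controller": 3.0, "nand": 1.0})

print(f"Equal weights:  {equal:.1f} C")
print(f"Skewed weights: {skewed:.1f} C")
```

With the same raw readings, shifting weight away from the hotter sensor lowers the reported composite value without any individual sensor reading being falsified.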

 Post subject: Re: Samsung 870 Evo -- is it a dud model?
PostPosted: September 8th, 2023, 15:14

Joined: September 8th, 2009, 18:21
Posts: 15461
Location: Australia
pepe wrote:
I am disappointed: you cannot buy an MLC SSD even if you are prepared to pay a higher price...

What about an enterprise SSD where the TLC cells have been reconfigured as pseudo-SLC or pseudo-MLC?
