Macbook Air 2013 SMART Failing SSD - OSX/Bootcamp Cloning

I have a 2013 Macbook Air with a 512GB SSD that came factory installed.

For the last couple of days I have been getting an error when booting into Windows 7 stating that the drive is going to fail imminently and that I should back up. This is accompanied by the following entry in Event Viewer: "The driver has detected that device \Device\Harddisk0\DR0 has predicted that it will fail. Immediately back up your data and replace your hard disk drive. A failure may be imminent."

Now I have booted in to Mac OS and checked the drive in the 'About Mac' dialogue and the drive shows as 'SMART Status: Failing'. I have downloaded various SMART utilities and they seem to point towards a particular attribute as being out of range (I will have to boot in to Mac OS to double check which one).

To get to the point, I have spoken to Apple and explained the issue and having the extended Apple Care they have said I should be able to get the SSD replaced if I take it to a local authorised service provider (I am currently living in Abu Dhabi, but bought the Macbook in the UK).
Herein lies my main question. Assuming I am able to get the SSD replaced with a new blank one, what is the best way to effectively clone everything that is on the current SSD, Mac OS and Bootcamp partition to the new SSD so that it functions just like it does currently?

Given that I can't readily plug in a second SATA or PCIe SSD to do a direct clone, I was wondering if anyone knows of a utility, perhaps bootable from USB, that will allow me to clone the whole drive, both partitions included (perhaps sector-by-sector?) to an external USB drive and then transfer this 'clone' or image to the new blank SSD within the same (or a different) bootable USB environment?

The new drive will (presumably) be the same size as the old one, so that should remove any complications in regards to partition sizes but I'm not familiar with software or the best way to achieve what I need to.

Any help would be greatly appreciated.

Some SMART failures don't appear to be real, especially in the case of Apple/Samsung SMxxxxF SSDs.

See http://www.tomshardware.com/answers/id- ... p-mac.html

We've been cloning a 512G MacBook SSD for several days now. It is as though one of the memory chips is barely responsive, but it looks like we will get the majority of the data recovered...assuming no major surprises with file system damage and encryption.

The easiest option for you is to download Carbon Copy Cloner. You can get a fully working trial.

Get a USB hard drive, boot up your Mac, run Carbon Copy Cloner and clone your internal drive onto the USB hard drive.

You have two options. The first is to create a whole-drive clone, which means you can boot from the USB drive by holding down the Alt key at boot. The second is to create an image, but this won't be bootable.

Once you get your Mac back from repair, boot it from the USB drive and use Carbon Copy Cloner to clone the USB drive to the internal one.

The other option is to install OS X on a USB drive and use Migration Assistant to copy all your stuff from the internal drive to the USB drive. Then, when you get the Mac back, use Migration Assistant to copy everything back over.

Ideally, the cloning should take place in an environment where no OS will attempt to talk to the drive. The goal is to avoid any further writing to the drive (which happens in the background without the user's awareness). If minimizing data corruption matters most, a hardware imager is the best choice; properly outfitted data recovery companies have them. Software-wise, depending on your level of experience, the best choice is ddrescue. I suggest practicing on other drives first. Screwing up is common.

Sam-Al- wrote:Now I have booted in to Mac OS and checked the drive in the 'About Mac' dialogue and the drive shows as 'SMART Status: Failing'. I have downloaded various SMART utilities and they seem to point towards a particular attribute as being out of range (I will have to boot in to Mac OS to double check which one).

Could you show us a CrystalDiskInfo SMART report? I'm wondering whether attribute AD/173 is the problematic one. If so, then I suspect that there may be thousands of Apple users who are blissfully unaware of this "problem".

Hi fzabkar,

I recently got a failing notice for the SM0512F SSD in my MBA 13'' (mid 2013) as well.
After reading these, I think I have a similar issue:
http://www.tomshardware.com/answers/id- ... p-mac.html
http://www.reddit.com/r/applehelp/comme ... s_failing/
http://www.reddit.com/r/applehelp/comme ... sd_issues/

After ~2 years of use the S.M.A.R.T. info looks alright, except that attribute 173 "Wear_Leveling_Count" has reached the threshold of 100, with a raw value of ~3000 (P/E cycles?). The values of attributes 174 and 175 (read/write) are ~7TB.

Do you have any suggestions?

@cutemama, ISTM that there may be a bug in Samsung's firmware. Did you become aware of this problem via Windows? If so, does Mac OS alert you to this problem in normal usage, or do you need to run a Mac diagnostic against the drive before you see the problem?

AISI, your results can be interpreted in several ways.

Let's assume that the wear levelling is perfect, ie that all memory cells have recorded the same number of P/E cycles. Let's also assume that write amplification is 1:1.

Then ...

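P/E cycles = total number of host writes / capacity = 7TB / 512GB = 13.7

Otherwise ...

total number of drive writes = P/E cycles x capacity = 3000 x 512GB = 1.5 Petabytes

write amplification = drive writes / host writes = 1.5PB / 7TB = 200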

Of course none of these results is consistent with the SMART data. This suggests that, either there is a firmware bug, or that our interpretation of the attributes is incorrect.

The other thing that you need to ask is whether it is even possible to write 1.5PB during the drive's lifetime.

Assuming that the drive remained powered on during its entire life ..

(1.5 petabytes) / (2 years) = 24 MB per second = 86 GB / hour
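The back-of-envelope figures above can be reproduced with a short Python sketch. It assumes decimal units (1 TB = 10^12 bytes) and takes the ~7TB, 512GB, ~3000 P/E cycles and 2-year figures from this thread; the variable names are purely illustrative. The thread rounds the implied writes to 1.5PB, so the unrounded results come out slightly higher.

```python
# Figures quoted in the thread; decimal units are an assumption.
HOST_WRITES_BYTES = 7e12                   # ~7 TB of host writes (attribute 175)
CAPACITY_BYTES = 512e9                     # nominal 512 GB drive
REPORTED_PE_CYCLES = 3000                  # approximate raw value of attribute 173
DRIVE_AGE_SECONDS = 2 * 365 * 24 * 3600    # ~2 years, powered on the whole time

# Case 1: trust the host writes, assume perfect wear levelling and 1:1 amplification.
expected_cycles = HOST_WRITES_BYTES / CAPACITY_BYTES

# Case 2: trust the reported P/E cycles instead.
implied_writes = REPORTED_PE_CYCLES * CAPACITY_BYTES
write_amplification = implied_writes / HOST_WRITES_BYTES

# Sustained write rate needed to reach the implied writes in 2 years.
sustained_rate_mb_s = implied_writes / DRIVE_AGE_SECONDS / 1e6

print(f"expected P/E cycles : {expected_cycles:.1f}")
print(f"implied drive writes: {implied_writes / 1e15:.2f} PB")
print(f"write amplification : {write_amplification:.0f}")
print(f"sustained write rate: {sustained_rate_mb_s:.1f} MB/s")
```

Either the drive has seen about 14 P/E cycles, or it has been written to at over 20 MB/s around the clock for two years. Neither fits the reported attribute values.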

I don't think I have ever seen a flash device where the failure can be attributed to exceeding any of the flash so called "lifetime" values. It is usually bad firmware, some component going out of spec or failing, environment or unexplained. I am not sure I value SMART data on flash devices either.

I would be getting the data off asap, even if a cloning method is not decided on. Don't play around for a while trying to figure this and that, flash is unforgiving.

I don't believe there is any real problem. All the similar examples I have found show no evidence of bad blocks or problems in any other SMART attributes.

In fact there are some easy tests that could illuminate the problem.

1/ Write a 10MB file to the drive. Does the raw value of the Total_Host_Writes attribute increase by 10?

2/ Zero-fill the drive. This will erase all the data, so a backup would be mandatory. Does the Total_Host_Writes attribute increase by 512GB? Do the max/min/average raw values of the Wear_Levelling_Count increase by 1?

3/ Perform an ATA Secure Erase. This will send the erase command to the drive, after which the drive will execute it internally, without further intervention by the host. Does the Total_Host_Writes attribute remain unchanged? Do the max/min/average raw values of the Wear_Levelling_Count increase by 1?

Note that an encrypted drive will execute an ATA Secure Erase command by simply throwing away the encryption key, in which case test #3 will require only a few seconds and should not change the SMART values.

I notice that in the following example the Percentage Used Endurance Indicator is 0 (GP Log 0x04) whereas the normalised value of the Wear_Leveling_Count is 68. That obviously can't be right.

If we examine the raw value in hexadecimal, it appears to consist of three words, each of which appears to correspond to the minimum, maximum, and average P/E cycles.

16372675186484 = 0x 0EE4 0F7D 0F34 = 3812/3965/3892 P/E cycles
The normalised value appears to decrement by 1 for every 30 P/E cycles.
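The hexadecimal split described above can be checked with a few lines of Python. This is only a sketch of the interpretation proposed in this post (three 16-bit words holding minimum, maximum and average P/E cycles); the function name `split_words` is made up for illustration and the layout is not confirmed by any Samsung documentation.

```python
def split_words(raw):
    """Split a 48-bit SMART raw value into three 16-bit words, high to low."""
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF

# Raw value of attribute 173 from the SM0512F report in this post.
raw_173 = 16372675186484

# Assumed order (per the interpretation above): minimum, maximum, average.
minimum, maximum, average = split_words(raw_173)

print(f"0x{raw_173:012X} -> min {minimum}, max {maximum}, avg {average}")
```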

http://www.reddit.com/r/applehelp/comme ... sd_issues/

APPLE SSD SM0512F

Code::
SMART Attributes Data Structure revision number: 40
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     -O-RC-   200   200   000    -    0
  5 Reallocated_Sector_Ct   PO--CK   100   100   000    -    0
  9 Power_On_Hours          -O--CK   099   099   000    -    2149
 12 Power_Cycle_Count       -O--CK   099   099   000    -    889
169 Unknown_Attribute       PO--C-   253   253   010    -    4106676608512
173 Wear_Leveling_Count     -OS-CK   068   068   100    NOW  16372675186484
174 Host_Reads_MiB          -O---K   099   099   000    -    5619695
175 Host_Writes_MiB         -O---K   099   099   000    -    6963394
192 Power-Off_Retract_Count -O--C-   099   099   000    -    73
194 Temperature_Celsius     -O---K   039   039   000    -    61 (Min/Max 14/72)
197 Current_Pending_Sector  -O---K   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O-RC-   200   199   000    -    0
240 Unknown_SSD_Attribute   -O---K   100   100   000    -    0

Device Statistics (GP Log 0x04)
Page Offset Size         Value  Description
  1  =====  =                =  == General Statistics (rev 2) ==
  1  0x008  4              889  Lifetime Power-On Resets
  1  0x010  4             2149  Power-on Hours
  1  0x018  6      14261031181  Logical Sectors Written
  1  0x020  6        171555258  Number of Write Commands
  1  0x028  6      11509136348  Logical Sectors Read
  1  0x030  6        120292567  Number of Read Commands
  4  =====  =                =  == General Errors Statistics (rev 1) ==
  4  0x008  4                0  Number of Reported Uncorrectable Errors
  4  0x010  4                0  Resets Between Cmd Acceptance and Completion
  6  =====  =                =  == Transport Statistics (rev 1) ==
  6  0x008  4                0  Number of Hardware Resets
  6  0x010  4                0  Number of ASR Events
  6  0x018  4                0  Number of Interface CRC Errors
  7  =====  =                =  == Solid State Device Statistics (rev 1) ==
  7  0x008  1               0~  Percentage Used Endurance Indicator
                            |_ ~ normalized value

fzabkar wrote:@cutemama, ISTM that there may be a bug in Samsung's firmware. Did you become aware of this problem via Windows? If so, does Mac OS alert you to this problem in normal usage, or do you need to run a Mac diagnostic against the drive before you see the problem?

AISI, your results can be interpreted in several ways.

Let's assume that the wear levelling is perfect, ie that all memory cells have recorded the same number of P/E cycles. Let's also assume that write amplification is 1:1.

Then ...

P/E cycles = total number of host writes / capacity = 7TB / 512GB = 13.7

Otherwise ...

total number of drive writes = P/E cycles x capacity = 3000 x 512GB = 1.5 Petabytes

write amplification = drive writes / host writes = 1.5PB / 7TB = 200

Of course none of these results is consistent with the SMART data. This suggests that, either there is a firmware bug, or that our interpretation of the attributes is incorrect.

The other thing that you need to ask is whether it is even possible to write 1.5PB during the drive's lifetime.

Assuming that the drive remained powered on during its entire life ..

(1.5 petabytes) / (2 years) = 24 MB per second = 86 GB / hour

I have both Windows 7 (400G) and Mac OS (100G), but only occasionally use the Mac side. I first became aware of the problem when I received the message 'windows detected a disk problem' in Windows. After backing up, I ran chkdsk but it found nothing wrong. Then in Mac OS, Disk Utility showed 'failing', but the disk test for the Mac partition was OK. At the moment, there is no visible problem when running either Windows or Mac OS.

I'm convinced there is a bug in the firmware. In fact while I was researching your problem I found another, different SMART bug in a different Samsung SSD.

IIUC, you are saying that Apple's OS doesn't alert you to any problem in normal usage, in which case I suspect that there may be many Mac users who have the same problem but are blissfully unaware of it. It might be worth asking for feedback in the Mac forums.

Just FYI, I think I have worked out the meaning of attribute 169, after comparing several Samsung examples.

Code::
3874748374528 = 0x 0386 2900 1E00
                    |    |    |
                    |    |   Total Reserved Block Count (Chip) ???
                    |    |
                    |   Total Reserved Block Count (Used + Unused)
                    |
                   Used Reserved Block Count

This appears to be telling us that the SSD left the factory with 902 (= 0x386) bad blocks.

The total number of reserved blocks is 10496 (= 0x2900, "Total") or 7680 (= 0x1E00, "Chip").

I don't understand the distinction between "Chip" and "Total" Reserved Block Count, but that's how Samsung's documentation refers to these attributes.

Code::
178 B2 Used Reserved Block Count (Chip)
179 B3 Used Reserved Block Count (Total)
180 B4 Unused Reserved Block Count (Total)

http://web.archive.org/web/201007140712 ... _rev11.pdf

The equivalent 256GB model (SM0256F) has exactly half as many reserved blocks.
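To compare drives under this interpretation of attribute 169, here is a small Python sketch. The field names follow the guesses in this post ("used", "total", "chip") and are assumptions, as is the helper name `decode_reserved_blocks`. The second sample is the raw value of attribute 169 from the smartctl report for the failing SM0512F quoted earlier in this thread.

```python
def decode_reserved_blocks(raw):
    """Decode a 48-bit raw value as three 16-bit counters (assumed layout)."""
    return {
        "used": (raw >> 32) & 0xFFFF,   # Used Reserved Block Count
        "total": (raw >> 16) & 0xFFFF,  # Total Reserved Block Count (Used + Unused)
        "chip": raw & 0xFFFF,           # Total Reserved Block Count (Chip) ???
    }

samples = {
    "example in this post": 3874748374528,
    "smartctl report earlier in the thread": 4106676608512,
}

for label, raw in samples.items():
    fields = decode_reserved_blocks(raw)
    print(f"{label}: 0x{raw:012X} -> {fields}")
```

In both samples the "total" and "chip" words are identical and only the "used" word differs, which is at least consistent with the lower two words being fixed per-model constants.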

Maybe these blocks are reserved by the firmware for something other than, or in addition to, bad blocks???

Will this be helpful?
http://www.samsung.com/global/business/ ... per07.html

Samsung doesn't consistently define its SSD SMART attributes. They can and do vary between models, and even between firmware versions for the same model.

AFAICT, the following interpretations and descriptions apply to your SM0512F model. I have denoted my own interpretations with an asterisk.

Code::
169 A9 * Reserved Block Count
173 AD * P/E Cycle Count (Max)
174 AE   Host_Reads_MiB
175 AF   Host_Writes_MiB
177 B1 * P/E Cycle Count (Min)
178 B2   Used Reserved Block Count (Chip)
179 B3   Used Reserved Block Count (Total)
180 B4   Unused Reserved Block Count (Total)
232 E8 * Unused Reserved Block Count (Chip) ???

Samsung's official documentation describes attribute 175/AF as "Program Fail Count (Chip)", but this is clearly incorrect in your case.

In the following example, the read/write statistics in GP Log 0x04 are defined by the ATA standard, not Samsung, so we can work backwards to decode the real meaning of Samsung's SMART attributes.

http://sourceforge.net/p/smartmontools/ ... midt.de/1/

Device Model: APPLE SSD SM0512F
Firmware Version: UXM2JA1Q

Code::
174 Host_Reads_MiB  -O---K 099 099 000 - 2101127
175 Host_Writes_MiB -O---K 099 099 000 - 2185830

Device Statistics (GP Log 0x04)
Page Offset Size      Value  Description
  1  0x018  6    4476580560  Logical Sectors Written
  1  0x028  6    4303109252  Logical Sectors Read
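The cross-check between the ATA statistics and the SMART attributes can be done in a few lines of Python. This sketch assumes 512-byte logical sectors (the report does not state the sector size) and that the drive truncates to whole MiB; `sectors_to_mib` is an illustrative helper, not part of any tool.

```python
SECTOR_BYTES = 512      # assumption: 512-byte logical sectors
MIB = 1024 * 1024

def sectors_to_mib(sectors, sector_bytes=SECTOR_BYTES):
    """Convert a logical sector count to whole MiB (rounded down)."""
    return sectors * sector_bytes // MIB

# Values from the UXM2JA1Q report above.
written_mib = sectors_to_mib(4476580560)   # Logical Sectors Written
read_mib = sectors_to_mib(4303109252)      # Logical Sectors Read

print(f"written: {written_mib} MiB (attribute 175 raw: 2185830)")
print(f"read   : {read_mib} MiB (attribute 174 raw: 2101127)")
```

Both conversions reproduce the raw values of attributes 175 and 174 exactly, which supports reading them as host writes and host reads in MiB.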

Here are differences in Samsung's own documentation:

https://code.google.com/p/hddguardian/w ... attributes
http://web.archive.org/web/201007140712 ... _rev11.pdf

Code::
183 B7 Runtime bad block (Total)
187 BB Uncorrectable Error Count
190 BE Temperature Exceed Count
194 C2 Airflow Temperature

http://webcache.googleusercontent.com/s ... per07.html

Code::
ID # 183 B7 Uncorrectable Error Count
ID # 190 BE Air Flow temperature

Many thanks for the information. The firmware version for my SSD is UXM2EA1Q, so some attributes might not be the same.

Found another case:
https://discussions.apple.com/message/25708607#25708607

Code::
(1) APPLE SSD SM0512F
----------------------------------------------------------------------------
           Model : APPLE SSD SM0512F
        Firmware : UXM2EA1Q
   Serial Number : S18YNYAD810356
       Disk Size : 500.2 GB (8.4/137.4/500.2/500.2)
     Buffer Size : Unknown
     Queue Depth : 32
    # of Sectors : 977105060
   Rotation Rate : ---- (SSD)
       Interface : Serial ATA
   Major Version : ATA8-ACS
   Minor Version : ATA8-ACS version 4c
   Transfer Mode : SATA/600
  Power On Hours : 8188 hours
  Power On Count : 2028 count
     Temparature : 48 C (118 F)
   Health Status : Good
        Features : S.M.A.R.T., AAM, 48bit LBA, NCQ, TRIM
       APM Level : ----
       AAM Level : 8000h [OFF]

-- S.M.A.R.T. --------------------------------------------------------------
ID Cur Wor Thr RawValues(6) Attribute Name
01 200 200 __0 000000000000 Read Error Rate
05 100 100 __0 000000000000 Reallocated Sectors Count
09 _98 _98 __0 000000001FFC Power-On Hours
0C _97 _97 __0 0000000007EC Power Cycle Count
A9 253 253 _10 039B29001E00 Unknown
AD 100 100 100 0B270BCB0B6E Unknown
AE _99 _99 __0 000000827C0B Unknown
AF _99 _99 __0 000000762F25 Unknown
C0 _99 _99 __0 000000000019 Unsafe Shutdown Count
C2 _52 _52 __0 004F00050030 Temperature
C5 100 100 __0 000000000000 Current Pending Sector Count
C7 200 199 __0 000000000000 Unknown
F0 100 100 __0 000000000000 Unknown

The info given above was from 05/08/2015

Today (10/08/2015) I checked it again, and the attribute 173 value has dropped by one:

Code::
-- S.M.A.R.T. --------------------------------------------------------------
ID Cur Wor Thr RawValues(6) Attribute Name
01 200 200 __0 000000000000 Read Error Rate
05 100 100 __0 000000000000 Reallocated Sectors Count
09 _98 _98 __0 00000000204B Power-On Hours
0C _97 _97 __0 0000000007F7 Power Cycle Count
A9 253 253 _10 039B29001E00 Unknown
AD _99 _99 100 0B350BDA0B81 Unknown
AE _99 _99 __0 00000083379F Unknown
AF _99 _99 __0 0000007713CB Unknown
C0 _99 _99 __0 000000000019 Unsafe Shutdown Count
C2 _45 _45 __0 004F00050037 Temperature
C5 100 100 __0 000000000000 Current Pending Sector Count
C7 200 199 __0 000000000000 Unknown
F0 100 100 __0 000000000000 Unknown

The average wear levelling count increased by 19 (= 0xB81 - 0xB6E) while the host writes increased by 57GiB (= 0x7713cb - 0x762f25 MiB). Clearly that's absurd, so I'm convinced that there is a firmware bug.
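The comparison between the two readings can be worked through in Python. This is a sketch under stated assumptions: the low 16-bit word of attribute AD is the average P/E count, attribute AF is host writes in MiB, wear levelling is perfect, and the drive holds a nominal 512 GB (decimal). The variable names are illustrative only.

```python
GIB = 1024 ** 3
CAPACITY_BYTES = 512e9   # assumption: nominal decimal capacity

# Raw values from the two CrystalDiskInfo readings (05/08/2015 and 10/08/2015).
ad_first, ad_second = 0x0B270BCB0B6E, 0x0B350BDA0B81   # attribute AD (173)
af_first, af_second = 0x762F25, 0x7713CB               # attribute AF (175), MiB

# Low 16-bit word of AD, assumed to be the average P/E cycle count.
avg_pe_delta = (ad_second & 0xFFFF) - (ad_first & 0xFFFF)

host_delta_mib = af_second - af_first
host_delta_gib = host_delta_mib / 1024

# If every block really gained avg_pe_delta cycles, the flash absorbed this much:
implied_flash_gib = avg_pe_delta * CAPACITY_BYTES / GIB
implied_amplification = implied_flash_gib / host_delta_gib

print(f"average P/E delta    : {avg_pe_delta}")
print(f"host writes delta    : {host_delta_gib:.1f} GiB")
print(f"implied flash writes : {implied_flash_gib:.0f} GiB")
print(f"implied amplification: {implied_amplification:.0f}x")
```

Under these assumptions the flash would have absorbed thousands of GiB of writes for about 57 GiB written by the host in five days, which is the mismatch described above.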

Hi,

I have two late 2013 27" iMacs where one has exactly the same problem:

- Win 7 in Bootcamp warned about the SMART error
- Disk Utility in OS X suggests replacing the SSD
- Wear Level Count is allegedly depleted
- around 7TB of data have been written to the SSD
- all other SMART values/tests show no problem

The second one (which doesn't get used that often) shows around 3.5TB written to disk and around 65% of wear levelling gone. I think the error here is imminent.

Has anyone tried to open a ticket at Samsung for this?

Cheers,

Johannes
