View topic - WD RED WD80EFAX failed, but can PCB replacement fix it?


All times are UTC - 5 hours [ DST ]



cromo

Post subject: WD RED WD80EFAX failed, but can PCB replacement fix it?

Posted: December 7th, 2023, 16:07

Joined: December 7th, 2023, 15:53
Posts: 2
Location: Poland

Hi there,

I have already discussed this issue extensively on the ServeTheHome forums (here), and was pointed in your direction as a more specialized bunch.

Here is a summary of the issue:

My WD RED WD80EFAX HDD suddenly died a few weeks ago: I shut down my Proxmox server, booted it up again and the drive started "clicking". It clicked for a while, then stopped and has not done so since. I did not receive any SMART warnings ahead of time, and looking back at the /var/lib/smartmontools/ attrlog, I don't think there was anything to worry about there (couldn't copy'n'paste the table here, see the original post at ServeTheHome for reference if needed).

The HDD was connected through an external USB enclosure, so I first tested with another USB enclosure to make sure the problem persists, and unfortunately it does. What I was seeing in dmesg was:

Code:

[25343.421737] usb 2-3: new SuperSpeed USB device number 8 using xhci_hcd
[25343.442848] usb 2-3: New USB device found, idVendor=152d, idProduct=1561, bcdDevice= 1.04
[25343.442854] usb 2-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[25343.442857] usb 2-3: Product: SABRENT
[25343.442858] usb 2-3: Manufacturer: SABRENT
[25343.442860] usb 2-3: SerialNumber: DB98765432143
[25343.446053] scsi host1: uas
[25343.446591] scsi 1:0:0:0: Direct-Access     SABRENT                   0104 PQ: 0 ANSI: 6
[25343.448532] sd 1:0:0:0: Attached scsi generic sg0 type 0
[25353.377987] sd 1:0:0:0: [sda] 1953506646 4096-byte logical blocks: (8.00 TB/7.28 TiB)
[25353.378144] sd 1:0:0:0: [sda] Write Protect is off
[25353.378147] sd 1:0:0:0: [sda] Mode Sense: 53 00 00 08
[25353.378427] sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[25353.378658] sd 1:0:0:0: [sda] Preferred minimum I/O size 32768 bytes
[25353.378662] sd 1:0:0:0: [sda] Optimal transfer size 268431360 bytes not a multiple of preferred minimum block size (32768 bytes)
[25384.996385] sd 1:0:0:0: [sda] tag#22 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
[25384.996393] sd 1:0:0:0: [sda] tag#22 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
[25385.016413] scsi host1: uas_eh_device_reset_handler start
[25385.148590] usb 2-3: reset SuperSpeed USB device number 8 using xhci_hcd
[25385.174465] scsi host1: uas_eh_device_reset_handler success
[25417.783354] scsi host1: uas_eh_device_reset_handler start
[25417.783528] sd 1:0:0:0: [sda] tag#24 uas_zap_pending 0 uas-tag 1 inflight: CMD
[25417.783535] sd 1:0:0:0: [sda] tag#24 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
[25417.915763] usb 2-3: reset SuperSpeed USB device number 8 using xhci_hcd
[25417.937381] scsi host1: uas_eh_device_reset_handler success
[25450.530389] scsi host1: uas_eh_device_reset_handler start
[25450.530552] sd 1:0:0:0: [sda] tag#26 uas_zap_pending 0 uas-tag 1 inflight: CMD
[25450.530556] sd 1:0:0:0: [sda] tag#26 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
[25450.658774] usb 2-3: reset SuperSpeed USB device number 8 using xhci_hcd
[25450.680523] scsi host1: uas_eh_device_reset_handler success
[25453.039632] sd 1:0:0:0: [sda] tag#9 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=99s
[25453.039639] sd 1:0:0:0: [sda] tag#9 Sense Key : Aborted Command [current]
[25453.039641] sd 1:0:0:0: [sda] tag#9 Add. Sense: No additional sense information
[25453.039644] sd 1:0:0:0: [sda] tag#9 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
[25453.039646] I/O error, dev sda, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[25453.039650] Buffer I/O error on dev sda, logical block 0, async page read
[25483.301277] sd 1:0:0:0: [sda] tag#10 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
[25483.301299] sd 1:0:0:0: [sda] tag#10 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
[25483.345279] scsi host1: uas_eh_device_reset_handler start
[25483.477571] usb 2-3: reset SuperSpeed USB device number 8 using xhci_hcd
[25483.499402] scsi host1: uas_eh_device_reset_handler success

While the disk appears to report its capacity (7.28 TiB), I cannot get smartctl to show anything at all: when connected via the USB bridge, issuing -c, -i or -a ends with smartctl hanging. The disk does, however, "tick" rhythmically and rather quietly while smartctl remains stuck, yet it is not the louder "clicking" sound.

Additionally, I connected the drive directly via SATA on another system and am now getting different errors, which I thought could be further evidence of the electronics failing. Correct me if I am wrong, but do errors such as "failed to enable AA", "Read log 0x00 page 0x00 failed", etc. suggest a communication error with the disk? Or would these also appear when the disk fails to, e.g., operate the heads to read from the HPA?
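For anyone reading along, the numbers in that dmesg output can be cross-checked with a few lines of Python. This is only a sketch for illustration: the helper names decode_read10 and capacity are made up here, and the byte layout assumed is the standard 10-byte SCSI Read(10) CDB (opcode 0x28, 4-byte big-endian LBA, 2-byte big-endian transfer length).

```python
def decode_read10(cdb_hex):
    """Decode a Read(10) CDB as printed by the kernel, e.g. '28 00 00 ...'."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length in blocks
    return lba, blocks


def capacity(block_count, block_size):
    """Return (bytes, decimal TB, binary TiB) for a block count and block size."""
    total = block_count * block_size
    return total, total / 10**12, total / 2**40


# The command that keeps timing out in the log above
lba, blocks = decode_read10("28 00 00 00 00 00 00 00 01 00")
print(f"Read(10): LBA {lba}, {blocks} block(s)")

# Capacity reported through the USB bridge (4096-byte logical blocks)
total, tb, tib = capacity(1953506646, 4096)
print(f"{total} bytes = {tb:.2f} TB = {tib:.2f} TiB")
```

So the request that never completes is a read of a single block at LBA 0, and the reported block count works out to the 8.00 TB / 7.28 TiB shown by the kernel. The same byte total falls out of the 512-byte sector count in the direct-SATA log further down (15628053168 sectors), so both connections report the same identity data even though no user data can be read.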

Code:

2023-11-28T20:36:43.823710+01:00 proxmox kernel: [ 1687.533870] ata6: link is slow to respond, please be patient (ready=0)
2023-11-28T20:36:48.059737+01:00 proxmox kernel: [ 1691.769848] ata6: COMRESET failed (errno=-16)
2023-11-28T20:36:49.027742+01:00 proxmox kernel: [ 1692.737841] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2023-11-28T20:36:49.027770+01:00 proxmox kernel: [ 1692.739092] ata6.00: failed to read native max address (err_mask=0x100)
2023-11-28T20:36:49.027773+01:00 proxmox kernel: [ 1692.739769] ata6.00: HPA support seems broken, skipping HPA handling
2023-11-28T20:36:54.607898+01:00 proxmox kernel: [ 1698.317931] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2023-11-28T20:36:54.607905+01:00 proxmox kernel: [ 1698.318902] ata6.00: ATA-9: WDC WD80EFAX-68LHPN0, 83.H0A83, max UDMA/133
2023-11-28T20:36:54.607906+01:00 proxmox kernel: [ 1698.319966] ata6.00: failed to enable AA (error_mask=0x1)
2023-11-28T20:36:54.611989+01:00 proxmox kernel: [ 1698.322029] ata6.00: Read log 0x00 page 0x00 failed, Emask 0x1
2023-11-28T20:36:54.611993+01:00 proxmox kernel: [ 1698.322786] ata6.00: NCQ Send/Recv Log not supported
2023-11-28T20:36:54.611994+01:00 proxmox kernel: [ 1698.323423] ata6.00: Read log 0x00 page 0x00 failed, Emask 0x40
2023-11-28T20:36:54.611994+01:00 proxmox kernel: [ 1698.324056] ata6.00: NCQ Send/Recv Log not supported
2023-11-28T20:36:54.611994+01:00 proxmox kernel: [ 1698.324707] ata6.00: Read log 0x00 page 0x00 failed, Emask 0x40
2023-11-28T20:36:54.611995+01:00 proxmox kernel: [ 1698.325353] ata6.00: 15628053168 sectors, multi 0: LBA48 NCQ (depth 32)
2023-11-28T20:36:54.611995+01:00 proxmox kernel: [ 1698.326043] ata6.00: failed to set xfermode (err_mask=0x40)
2023-11-28T20:36:54.615803+01:00 proxmox kernel: [ 1698.326712] ata6: limiting SATA link speed to 3.0 Gbps
2023-11-28T20:36:54.615824+01:00 proxmox kernel: [ 1698.327359] ata6.00: limiting speed to UDMA/133:PIO3
2023-11-28T20:37:00.063709+01:00 proxmox kernel: [ 1703.776061] ata6: SATA link down (SStatus 0 SControl 320)
2023-11-28T20:37:00.063743+01:00 proxmox kernel: [ 1703.776851] ata6.00: disable device
2023-11-28T20:37:00.803705+01:00 proxmox kernel: [ 1704.513792] ata6: SATA link down (SStatus 0 SControl 300)

Also, when connected directly like that, the system simply stops trying to initialize the disk (ata6.00: disable device), so there's no access to SMART at all.
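To make the SATA log a bit easier to read, here is a rough Python sketch that decodes the SStatus/SControl values and the err_mask/Emask bits. Treat it as an assumption-laden aid rather than a reference: the bit names are transcribed from my understanding of the AC_ERR_* flags in the kernel's include/linux/libata.h and should be checked against the source for your kernel version, and the helper names are invented for this example. The field layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) is the one used by the SATA SStatus and SControl registers, which the kernel prints in hex.

```python
# libata error-mask bits (assumed from include/linux/libata.h, verify locally)
AC_ERR = {
    0x001: "AC_ERR_DEV",         # device reported an error
    0x002: "AC_ERR_HSM",         # host state machine violation
    0x004: "AC_ERR_TIMEOUT",     # command timed out
    0x008: "AC_ERR_MEDIA",       # media error
    0x010: "AC_ERR_ATA_BUS",     # ATA bus error
    0x020: "AC_ERR_HOST_BUS",    # host bus error
    0x040: "AC_ERR_SYSTEM",      # system/internal error
    0x080: "AC_ERR_INVALID",     # invalid argument
    0x100: "AC_ERR_OTHER",       # unknown/other
    0x200: "AC_ERR_NODEV_HINT",  # polling hint: no device
    0x400: "AC_ERR_NCQ",         # marker for offending NCQ command
}


def decode_err_mask(mask):
    """List the names of all bits set in an err_mask / Emask value."""
    return [name for bit, name in AC_ERR.items() if mask & bit]


def decode_scr(value_hex):
    """Split an SStatus or SControl value (hex string from dmesg) into fields."""
    v = int(value_hex, 16)
    return {"det": v & 0xF, "spd": (v >> 4) & 0xF, "ipm": (v >> 8) & 0xF}


for mask in (0x100, 0x1, 0x40):
    print(hex(mask), decode_err_mask(mask))

print("SStatus 133 :", decode_scr("133"))   # link up
print("SStatus 0   :", decode_scr("0"))     # link down
print("SControl 320:", decode_scr("320"))   # after the speed limit was applied
```

Under those assumptions, SStatus 133 means a device is detected with the PHY up (DET=3) at Gen3 speed (SPD=3) in the active state (IPM=1), while SStatus 0 means no device is detected, and SControl 320 carries the Gen2 speed limit that matches the "limiting SATA link speed to 3.0 Gbps" line. The masks 0x1, 0x40 and 0x100 map to a device-reported error, an internal/system error and "other". None of this tells a PCB fault apart from a head or firmware-area fault on its own; it only shows that the link comes up and that the failures occur at the command level.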

Lastly, I removed the PCB and did not see any immediate damage to it. I also cleaned it up a bit, including the spring joints, but that didn't do anything.

At this point I am wondering whether these SATA link errors could indicate a PCB failure. It would be an odd one, since the disk *does* spin up and partially reports itself (the model, at least). I am asking because I am not sure whether I should go through the effort of replacing the PCB. Donor PCBs for this model are readily available on Aliexpress at a reasonable price, but it would take quite some work to re-solder the BIOS SMD chip.


WebClaw

Post subject: Re: WD RED WD80EFAX failed, but can PCB replacement fix it?

Posted: December 8th, 2023, 10:59

Joined: November 24th, 2011, 21:48
Posts: 105
Location: Canada

Not a PCB issue. Likely an issue with the heads (the HDD "ticks" rhythmically). The drive cannot fully initialize or read its firmware, which is why you don't have SMART values or I/O access.


cromo

Post subject: Re: WD RED WD80EFAX failed, but can PCB replacement fix it?

Posted: December 8th, 2023, 12:56

Joined: December 7th, 2023, 15:53
Posts: 2
Location: Poland

Is there any low-skill fix I can try to revive it, even if only momentarily, to download the data? I understand that freezing the drive is not applicable here since the drive spins up just fine? Or could it also fix an issue with the heads? And what if it required opening the disk?

I imagine these questions are asked a lot, so a simple link to a trusted source on the recommended approaches, or any other sort of guideline, would be much appreciated.

Note that there is nothing critically important on that drive, I would be doing this mostly for fun and learning.


fzabkar

Post subject: Re: WD RED WD80EFAX failed, but can PCB replacement fix it?

Posted: December 8th, 2023, 12:56

Joined: September 8th, 2009, 18:21
Posts: 15610
Location: Australia

This is an HGST model with the new firmware architecture. I believe very few people have successfully performed a head swap in these drives.

_________________
A backup a day keeps DR away.


WebClaw

Post subject: Re: WD RED WD80EFAX failed, but can PCB replacement fix it?

Posted: December 8th, 2023, 23:05

Joined: November 24th, 2011, 21:48
Posts: 105
Location: Canada

A modern HDD is far more complex than one from ~20 years ago. Simply replacing mechanical parts rarely results in data being recovered. As fzabkar stated, that drive is not supported by many of the "go-to" tools used by data recovery professionals.

Simply put, it's not a DIY recovery candidate. Opening the drive will surely solidify the data loss and only increase the cost of recovery by a professional with the proper tools and experience.

Sorry bud.
