Hi,
I have had good experiences buying used HGST SAS drives off eBay for my machine, and recently got another one (same type, HUH728080AL5200) to add to the RAID (software RAID, LSI SAS2008).
My host OS of choice is Unraid, and part of my process for adding a new drive is to zero it out. This isn't strictly necessary, but I accept the time it costs in exchange for getting a 'feel' for the disk.
The script that zeroes out the drive essentially does this:
Code:
dd if=/dev/zero of="$disk" >/dev/null
On my most recent disk, this led to many failed sectors and a low write speed of about 1 MB/s, which confused me, as I hadn't experienced this with the other disks.
Trying to determine where this could be coming from, I came across a Reddit post from 2018 that helped me narrow down the error (I'm guessing; I don't want to go dd'ing the drives I have in use just to confirm, tbh):
https://www.reddit.com/r/homelab/commen ... d_i_screw/
While the OP on Reddit posted a thorough explanation of how they found the error, they unfortunately never followed up with a fix. Either way, I was able to replicate that writing to the HD in question fails with `sg_write_buffer` (which issues the SCSI WRITE BUFFER command; I assumed this is how the kernel would write), unlike with `sg_write_verify`, which issues the SCSI WRITE AND VERIFY command.
Code:
root@Tower:~# sg_write_buffer --in /dev/urandom --length 512 -vv /dev/sdb
open /dev/sdb with flags=0x802
sending single write buffer, mode=0x0, mpsec=0, id=0, offset=0, len=512
Write buffer cdb: [3b 00 00 00 00 00 00 02 00 00]
Write buffer parameter list (first 256 bytes):
d9 2f 95 44 2e fd d2 47 d2 59 c2 6d 9c a3 04 08
86 70 b9 5e 9b 6d 1c bc 9d ee 54 3d 7a be a6 2b
d7 34 75 7b dd 8a 59 d6 6d bd ea 0b 74 ff 12 84
01 24 21 58 e8 a2 4f 31 0e 68 33 ea a8 1d 16 3d
17 b6 4c b1 32 ce 29 a7 81 a6 15 09 ce c2 85 bb
f2 fd 45 8e 79 73 49 02 cf 5c 47 a2 68 9f 09 26
54 86 0a 05 49 db 0e 99 22 9e f3 da b8 91 8f 4f
e7 be 3f be ef 6e 2b 26 ff 0d bf e8 a2 47 d0 05
a9 fc b3 80 3a a4 e8 50 35 e9 c1 4b 8c ce d2 1f
17 c2 f1 dd 67 dc 65 9b 83 5f e7 37 85 eb dd 1a
68 67 a0 7a ac 56 b3 32 5f 33 af 7b 16 3e cd 03
92 43 84 a2 8a 59 ce 54 d2 a9 83 d5 53 51 11 56
34 c9 41 be 52 71 46 5f 4e f6 db 0b 7e 48 9b 7d
c8 c4 83 61 2c fb ba 37 26 5d e4 0c 64 a5 78 6c
28 5a eb 2e 92 58 99 39 6e d2 96 bd 42 1d a1 0f
42 db 84 6c 3f 5c 86 f7 59 2a 21 10 a5 2d 76 29
Write buffer:
Descriptor format, current; Sense key: Illegal Request
Additional sense: Invalid field in parameter list
Descriptor type: Information: Valid=0 (-> vendor specific) 0x0000000000000000
Descriptor type: Command specific: 0x0000000000000000
Descriptor type: Sense key specific: Field pointer:
Error in Data parameters: byte 0
Descriptor type: Field replaceable unit code: 0x0
Descriptor type: Block commands: Incorrect Length Indicator (ILI) clear
Descriptor type: Vendor specific [0x80]
f8
Write buffer failed: Illegal request, type: sense key, apart from Invalid opcode
`sg_write_verify` meanwhile works fine "ad infinitum":
Code:
root@Tower:~# sg_write_verify --ilen 4096 --in /dev/urandom --lba 0 -R -b 1 -vv --num=8 /dev/sdb | head
open /dev/sdb with flags=0x802
Issue Write and verify(10) to device /dev/sdb
ilen=4096 [0x1000], lba=0 [0x0]
wrprotect=0, dpo=0, bytchk=1, group=0, repeat=1
Write and verify(10) cdb: [2e 02 00 00 00 00 00 00 08 00]
Subsequent read from /dev/urandom got 4096 bytes
Write and verify(10) cdb: [2e 02 00 00 00 08 00 00 08 00]
Subsequent read from /dev/urandom got 4096 bytes
Write and verify(10) cdb: [2e 02 00 00 00 10 00 00 08 00]
Subsequent read from /dev/urandom got 4096 bytes
Write and verify(10) cdb: [2e 02 00 00 00 18 00 00 08 00]
Subsequent read from /dev/urandom got 4096 bytes
Write and verify(10) cdb: [2e 02 00 00 00 20 00 00 08 00]
Subsequent read from /dev/urandom got 4096 bytes
Write and verify(10) cdb: [2e 02 00 00 00 28 00 00 08 00]
Subsequent read from /dev/urandom got 4096 bytes
Write and verify(10) cdb: [2e 02 00 00 00 30 00 00 08 00]
Subsequent read from /dev/urandom got 4096 bytes
Write and verify(10) cdb: [2e 02 00 00 00 38 00 00 08 00]
Subsequent read from /dev/urandom got 4096 bytes
Write and verify(10) cdb: [2e 02 00 00 00 40 00 00 08 00]
Subsequent read from /dev/urandom got 4096 bytes
Write and verify(10) cdb: [2e 02 00 00 00 48 00 00 08 00]
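To make the CDB lines above easier to read, here is a minimal sketch that pulls the LBA and block count out of a 10-byte CDB as printed by `-vv`. The function name `decode_cdb10` is made up for this post, and it assumes the usual 10-byte layout (opcode in byte 0, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8).

```shell
#!/bin/sh
# Hypothetical helper: decode the LBA and transfer length from a 10-byte
# SCSI CDB given as space-separated hex bytes (the format printed with -vv).
decode_cdb10() {
    # Split the single argument into positional parameters, one per byte.
    set -- $1
    if [ "$#" -ne 10 ]; then
        echo "expected 10 bytes, got $#" >&2
        return 1
    fi
    opcode=$1
    # CDB bytes 2-5 (the 3rd to 6th values) hold the big-endian LBA.
    lba=$(( 0x$3$4$5$6 ))
    # CDB bytes 7-8 (the 8th and 9th values) hold the big-endian block count.
    len=$(( 0x$8$9 ))
    echo "opcode=0x$opcode lba=$lba blocks=$len"
}

# Second CDB from the sg_write_verify output above.
decode_cdb10 "2e 02 00 00 00 08 00 00 08 00"
```

Decoded this way, each command in the output above covers 8 blocks and the LBA advances by 8 per command, which lines up with `--num=8` and `--ilen 4096` (8 blocks of 512 bytes).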
So I'm a bit at a loss as to what the actual error could be.
Was the drive not properly recognised by the onboard controller? Possibly; is that something I could still check?
Cable problem? I think I can rule that out, as I've swapped the drive with a different disk of the same type, and that one works fine.
I've called the HGST helpdesk, who have been helpful and want to make sure the drive works (using their SMART checking software, Windows only), but they pointed out several times during the call that 'Linux is not supported'. Following the Redditor's train of thought, I'm inclined to think that it's either a recognition error on the LSI's side or some OEM firmware issue. FWIW, the firmware version (A907) is identical across all HGST drives in my machine:
Code:
root@Tower:~# lsscsi -sig | grep HUH728080
[8:0:0:0] disk HGST HUH728080AL5200 A907 /dev/sdb 35000cca23b844a3c /dev/sg1 8.00TB
[8:0:1:0] disk HGST HUH728080AL5200 A907 /dev/sdc 35000cca23b9c6c54 /dev/sg2 8.00TB
[8:0:2:0] disk HGST HUH728080AL5200 A907 /dev/sde 35000cca260a90580 /dev/sg4 8.00TB
[8:0:3:0] disk HGST HUH728080AL5200 A907 /dev/sdg 35000cca260aefbec /dev/sg6 8.00TB
[8:0:7:0] disk HGST HUH728080AL5200 A907 /dev/sdk 35000cca2541419d0 /dev/sg10 8.00TB
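For anyone who wants to run the same comparison without eyeballing the list, a rough sketch: group the devices by the revision column of the listing above. The helper name `fw_summary` is made up, and it assumes the column layout shown here (model in column 4, revision in column 5, block device in column 6, no spaces inside the vendor or model strings).

```shell
#!/bin/sh
# Hypothetical helper: read lsscsi-style lines on stdin and print each
# distinct firmware revision followed by the devices that report it.
fw_summary() {
    awk -v model="$1" '
        # Keep only rows whose model column contains the requested string.
        index($4, model) { devs[$5] = devs[$5] " " $6 }
        END { for (rev in devs) print rev ":" devs[rev] }
    '
}

# Two sample rows copied from the listing above.
fw_summary HUH728080 <<'EOF'
[8:0:0:0] disk HGST HUH728080AL5200 A907 /dev/sdb 35000cca23b844a3c /dev/sg1 8.00TB
[8:0:1:0] disk HGST HUH728080AL5200 A907 /dev/sdc 35000cca23b9c6c54 /dev/sg2 8.00TB
EOF
```

On the live system this would be fed as `lsscsi -sig | fw_summary HUH728080`; more than one output line would mean mixed firmware revisions.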
Any help would be greatly appreciated.