
All times are UTC - 5 hours [ DST ]





Who's the culprit?
The BIOS, it's old, you're doomed. 0% [ 0 ]
Oh those WDs are known for bug xyz in their FW. You're doomed. 0% [ 0 ]
It's Linux, it must be! You're doomed. 0% [ 0 ]
It's your fault. You're doomed. 100% [ 1 ]
Other 0% [ 0 ]
Total votes : 1
 Post subject: WD Reds fail to IDENTIFY at boot
Posted: February 9th, 2015, 17:30

Joined: February 9th, 2015, 17:12
Posts: 5
Location: here, really
You guys seem to be my last chance. How guru are you? :wink:

I have 2 Reds. Sometimes one fails to be detected at boot, sometimes the other, and on rare occasions both. Other times they both work.

My hardware:
* Asus M2NPV-VM motherboard - BIOS version 5005 (this was the last version ever released)
* Phenom II X4 910e CPU
* 8GiB DDR2 RAM (dual-channel 800MHz)
* 4 SATA drives:
** 2 Toshiba DTACA100 (ports SATA1 and 3)
** 2 WD Reds WD10EFRX (ports SATA2 and 4)
* ATI/AMD Radeon HD 5450 on a PCIe slot

The symptoms:
* the BIOS will show a drive's data as Cylinder: 4095, Head: 240, Landing Zone: 65534, Sector 255
* the sector count will be 268435455 instead of 1953525168
* the addressing mode is shown as LBA instead of LBA48
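
A note on that sector count: 268435455 is 2^28 - 1 (0x0FFFFFFF), the largest value a 28-bit LBA capacity field can hold. So the short count looks like the LBA28 capacity being used in place of the LBA48 one, not a random number. A minimal Python sketch of the arithmetic, assuming 512-byte logical sectors (the variable and function names are only illustrative):

```python
# Sector counts quoted in the symptoms; 512-byte logical sectors are assumed.
SECTOR_BYTES = 512
reported_sectors = 268435455      # what shows up on a bad boot
expected_sectors = 1953525168     # what the drive reports on a good boot

# Largest value a 28-bit LBA capacity field can hold.
lba28_ceiling = (1 << 28) - 1


def to_gb(sectors):
    """Convert a sector count to decimal gigabytes (10**9 bytes)."""
    return sectors * SECTOR_BYTES / 10**9


print(reported_sectors == lba28_ceiling)
print(hex(reported_sectors))
print(f"{to_gb(reported_sectors):.1f} GB seen vs {to_gb(expected_sectors):.1f} GB expected")
```

This only shows where the number comes from. It does not say whether the BIOS, the controller or the drive is the one handing out the 28-bit value.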

An example from dmesg (on a Debian wheezy, attached):
[ 1.025094] ata4: SATA max UDMA/133 cmd 0x960 ctl 0xb60 bmdma 0xf308 irq 20
[ 2.464068] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 2.473058] ata4.00: ATA-9: WDC WD10EFRX-68PJCN0, 01.01A01, max UDMA/133
[ 2.473061] ata4.00: 1953525168 sectors, multi 1: LBA48 NCQ (depth 31/32)
[ 2.481046] ata4.00: n_sectors mismatch 1953525168 != 268435455
[ 2.481048] ata4.00: revalidation failed (errno=-19)
[ 2.481094] ata4: limiting SATA link speed to 1.5 Gbps
[ 2.481097] ata4.00: limiting speed to UDMA/133:PIO3
[ 7.932494] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 7.940661] ata4.00: n_sectors mismatch 1953525168 != 268435455
[ 7.940664] ata4.00: revalidation failed (errno=-19)
[ 7.940707] ata4.00: disabled

Forcing the SATA driver to rescan (echo - - - > /sys/class/scsi_host/hostX/scan) does the trick:
[ 169.394406] ata4: hard resetting link
[ 170.272371] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 170.281149] ata4.00: ATA-9: WDC WD10EFRX-68PJCN0, 01.01A01, max UDMA/133
[ 170.281159] ata4.00: 1953525168 sectors, multi 1: LBA48 NCQ (depth 31/32)
[ 170.288893] ata4.00: configured for UDMA/133
[ 170.288910] ata4: EH complete
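
The rescan above can be scripted. A minimal Python sketch, assuming the usual sysfs layout where every SCSI host directory has a scan file; rescan_scsi_hosts and its dry_run switch are hypothetical names, and writing to the real sysfs needs root:

```python
import glob
import os


def rescan_scsi_hosts(sysfs_root="/sys/class/scsi_host", dry_run=True):
    """Write the '- - -' wildcard to every host's scan file.

    Returns the list of scan files found. With dry_run=True nothing is
    written, so the function can be tried safely first.
    """
    scan_files = sorted(glob.glob(os.path.join(sysfs_root, "host*", "scan")))
    for path in scan_files:
        if not dry_run:
            with open(path, "w") as scan:
                scan.write("- - -\n")   # same payload as the echo command
    return scan_files


if __name__ == "__main__":
    for path in rescan_scsi_hosts(dry_run=True):
        print("would rescan", path)
```

Rescanning every host is heavier than poking only the host behind ata4, but it avoids having to map ata ports to host numbers.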

What I've tried:
* switching cables and ports
* activating large drive in the BIOS (for 4KiB instead of the 512 emulation) and other BIOS toggles
* activating PUIS (power-up in standby) on one of the drives to troubleshoot a possible PSU spike on boot or something
* enabling (but not using) the BIOS NVRAID feature (to eventually set the driver into AHCI mode)
* SMART tests are ok

What I haven't done yet:
* set /boot outside the RAID1 that's on the Toshibas - a /boot inside an LVM inside a software RAID could apparently confuse the BIOS, but the system has always booted. Besides, the WDs are empty and not bootable, they're data disks.
* sell the WDs, get another DTACA100, set up a RAID5 instead of a RAID1 and a ZFS VOL.

The Toshibas have never failed to boot.

The folks at linux-ide have contributed some suggestions and WD support was useless. The WD community forum guys suggested what seems to be the most likely cause: the BIOS and the WD firmware don't get along. Either the BIOS is buggy - and I'm outta luck - or the WD FW is; or neither is, and I don't know what else to do. Enter the hddguru people. :)

Any suggestions? My current workaround is to hack a script to count enabled disks on boot, which I'd rather avoid.
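
For what it's worth, the detection half of such a workaround script could look like the sketch below. It is only a sketch: the helper names are made up, the sample text is cut from the dmesg excerpt above, and on a live system the text would come from the kernel log instead.

```python
import re

# Sample lines taken from the dmesg excerpt in this post.
SAMPLE_DMESG = """\
[ 2.481046] ata4.00: n_sectors mismatch 1953525168 != 268435455
[ 2.481048] ata4.00: revalidation failed (errno=-19)
[ 7.940661] ata4.00: n_sectors mismatch 1953525168 != 268435455
[ 7.940707] ata4.00: disabled
"""

DISABLED_RE = re.compile(r"(ata\d+\.\d+): disabled")
MISMATCH_RE = re.compile(r"(ata\d+\.\d+): n_sectors mismatch (\d+) != (\d+)")


def disabled_devices(dmesg_text):
    """Return the libata device names the kernel gave up on."""
    return sorted(set(DISABLED_RE.findall(dmesg_text)))


def sector_mismatches(dmesg_text):
    """Return (device, first_count, second_count) for each mismatch line."""
    return [(dev, int(a), int(b)) for dev, a, b in MISMATCH_RE.findall(dmesg_text)]


if __name__ == "__main__":
    print(disabled_devices(SAMPLE_DMESG))
    print(sector_mismatches(SAMPLE_DMESG))
```

If the list of disabled devices is not empty, the script could then trigger the rescan described earlier.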


Attachments:
2015-02-09.txt [51.94 KiB]

 Post subject: Re: WD Reds fail to IDENTIFY at boot
Posted: February 9th, 2015, 23:54

Joined: September 8th, 2009, 18:21
Posts: 16960
Location: Australia
When the problem occurs, does the drive have a HPA?

See http://en.wikipedia.org/wiki/Host_prote ... on_methods

_________________
A backup a day keeps DR away.


 Post subject: Re: WD Reds fail to IDENTIFY at boot
Posted: February 10th, 2015, 18:14

Joined: February 9th, 2015, 17:12
Posts: 5
Location: here, really
There's nothing referring to HPAs in dmesg on either failed or successful boots. As far as hdparm goes,

# hdparm -N /dev/disk/by-id/ata-WDC_WD10EFRX-68PJCN0_WD-SERIAL1

/dev/disk/by-id/ata-WDC_WD10EFRX-68PJCN0_WD-SERIAL1:
max sectors = 1953525168/1953525168, HPA is disabled

This was a clean boot, all disks were detected.
From hdparm -I:

Commands/features:
Enabled Supported:
* Host Protected Area feature set

So it does support it. I'll check if there's anything there next time it fails.
However, if the drive fails at boot it won't show up in my /dev. I guess I'll find out if hdparm can access the drive regardless.
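
To compare the two numbers automatically once the drive is visible, something like this could parse the hdparm -N line quoted above. It is a sketch: parse_hdparm_n is a made-up helper, and the regex only assumes the output format shown in this post.

```python
import re

# Matches e.g. "max sectors = 1953525168/1953525168, HPA is disabled"
HPA_RE = re.compile(r"max sectors\s*=\s*(\d+)/(\d+),\s*HPA is (\w+)")


def parse_hdparm_n(output):
    """Extract current/native sector counts and the HPA state, or None."""
    match = HPA_RE.search(output)
    if match is None:
        return None
    current, native = int(match.group(1)), int(match.group(2))
    return {
        "current": current,
        "native": native,
        "hidden": native - current,   # sectors hidden behind an HPA, if any
        "state": match.group(3),
    }


if __name__ == "__main__":
    sample = "max sectors = 1953525168/1953525168, HPA is disabled"
    print(parse_hdparm_n(sample))
```

A non-zero "hidden" value would mean part of the drive sits behind an HPA. Here both counts are equal, so nothing is hidden on a clean boot.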


 Post subject: Re: WD Reds fail to IDENTIFY at boot
Posted: February 10th, 2015, 18:35

Joined: September 8th, 2009, 18:21
Posts: 16960
Location: Australia
Can you use dd to copy 4000 sectors, say, from the end of each drive to a file? If it is a BIOS issue, then we may see BIOS related text strings, which would in turn suggest that BIOS is writing a backup of itself to the end of the drive.
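
One way to grab that tail without working out dd offsets by hand is sketched below in Python. It assumes 512-byte logical sectors and a path that can be opened for reading; read_tail is a hypothetical helper. With dd and the 1953525168-sector count reported for these drives, the equivalent would be bs=512 skip=1953521168 count=4000.

```python
import os


def read_tail(path, sectors=4000, sector_size=512):
    """Return (start_offset, data) for the last `sectors` sectors of `path`."""
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)               # total size in bytes
        start = max(0, size - sectors * sector_size)  # clamp for small files
        dev.seek(start)
        return start, dev.read(size - start)


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:                            # e.g. <device> <output file>
        offset, tail = read_tail(sys.argv[1])
        with open(sys.argv[2], "wb") as out:
            out.write(tail)
        print(f"copied {len(tail)} bytes starting at byte {offset}")
```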



 Post subject: Re: WD Reds fail to IDENTIFY at boot
Posted: February 12th, 2015, 19:22

Joined: February 9th, 2015, 17:12
Posts: 5
Location: here, really
Yup, if libata kicks the drive it won't be in /dev so there's no way for hdparm to find it.

fzabkar wrote:
Can you use dd to copy 4000 sectors, say, from the end of each drive to a file? If it is a BIOS issue, then we may see BIOS related text strings, which would in turn suggest that BIOS is writing a backup of itself to the end of the drive.


This is becoming interesting, I never thought of what lies beyond the addressable space...
From hdparm -I:
Code:
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors: 1953525168
        Logical  Sector size:                   512 bytes
        Physical Sector size:                  4096 bytes
        Logical Sector-0 offset:                  0 bytes
        device size with M = 1024*1024:      953869 MBytes
        device size with M = 1000*1000:     1000204 MBytes (1000 GB)
        cache/buffer size  = unknown
        Nominal Media Rotation Rate: 5400

Ha, there's "IntelliPower" for you... Anyway, since drive manufacturers use SI, I went with this:

Code:
# dd if=/dev/disk/by-id/ata-WD2 of=/root/end.img count=512 bs=1MB skip=1000000
204+1 records in
204+1 records out
204886016 bytes (205 MB) copied, 2.7828 s, 73.6 MB/s

I was expecting an EndOfDisk or something. Am I to assume there is an extra 205MB of space beyond the declared 1TB of drive size?
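
The numbers suggest there is no extra space and that dd simply ran into the end of the disk. The drive reports 1953525168 sectors of 512 bytes, which is a little over 10^12 bytes, and skipping 1000000 blocks of 1MB leaves exactly the 204886016 bytes that dd copied. A quick Python check of that arithmetic (all values are taken from the hdparm and dd output above; the variable names are illustrative):

```python
SECTORS = 1953525168        # LBA48 user addressable sectors (hdparm -I)
SECTOR_BYTES = 512          # logical sector size (hdparm -I)
DD_BLOCK = 1_000_000        # GNU dd treats bs=1MB as 10**6 bytes
DD_SKIP = 1_000_000         # skip=1000000 from the dd command line

disk_bytes = SECTORS * SECTOR_BYTES
tail_bytes = disk_bytes - DD_BLOCK * DD_SKIP
full_blocks, leftover = divmod(tail_bytes, DD_BLOCK)

print(disk_bytes)                                     # total addressable bytes
print(tail_bytes)                                     # what is left after the skip
print(f"{full_blocks}+{1 if leftover else 0} records")
print(hex(tail_bytes))                                # size of the dump in hex
```

The result matches dd's "204+1 records" and byte count, and in hex it equals the final offset hexdump prints for the image (0c365000).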
Then I had fun with hexdump -C. At the very beginning I got this:
Code:
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
0ba32fd0
(from hexdump's man page the * means it just repeats, so empty stuff) Then
from 0ba32fd0 to 0ba33580 it's ZFS data from my experiments with zpools (the drives had been failing way before this, and I only recently partitioned them)
from 0ba33590 to 0ba335a0 again empty space
from 0ba4efd0 to 0ba72fd0 random stuff that may or may not be garbage, like:

Code:
0ba4efd0  00 00 00 00 00 00 00 00  11 7a 0c b1 7a da 10 02  |.........z..z...|
0ba4efe0  04 f7 57 ed 73 55 a6 7f  62 0d 63 9a 40 00 c7 d3  |..W.sU..b.c.@...|
0ba4eff0  69 c9 ee 27 83 d9 03 0d  18 78 d5 9a 7d 0d 0e 62  |i..'.....x..}..b|
0ba4f000  00 00 00 00 00 00 00 00  88 13 00 00 00 00 00 00  |................|
0ba4f010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

from 0ba72fd0 to 0ba73580 again ZFS data which seems to be a repetition of the first part
from 0ba73590 to 0c361230 more empty/random data

And then the last bit:

Code:
0c361230  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
0c364e00  45 46 49 20 50 41 52 54  00 00 01 00 5c 00 00 00  |EFI PART....\...|
0c364e10  03 44 1b 31 00 00 00 00  af 6d 70 74 00 00 00 00  |.D.1.....mpt....|
0c364e20  01 00 00 00 00 00 00 00  22 00 00 00 00 00 00 00  |........".......|
0c364e30  8e 6d 70 74 00 00 00 00  67 40 84 50 1c 19 42 8d  |.mpt....g@.P..B.|
0c364e40  81 4d 3d 5f 5c 44 88 af  8f 6d 70 74 00 00 00 00  |.M=_\D...mpt....|
0c364e50  80 00 00 00 80 00 00 00  54 02 a2 a4 00 00 00 00  |........T.......|
0c364e60  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
0c365000

And for the sake of completeness, the same last bit from the other WD:
Code:
0c364e00  45 46 49 20 50 41 52 54  00 00 01 00 5c 00 00 00  |EFI PART....\...|
0c364e10  cd 19 cb 1a 00 00 00 00  af 6d 70 74 00 00 00 00  |.........mpt....|
0c364e20  01 00 00 00 00 00 00 00  22 00 00 00 00 00 00 00  |........".......|
0c364e30  8e 6d 70 74 00 00 00 00  df f0 b1 ed 0a ef 49 47  |.mpt..........IG|
0c364e40  93 e2 be 5c 87 bf 7c db  8f 6d 70 74 00 00 00 00  |...\..|..mpt....|
0c364e50  80 00 00 00 80 00 00 00  00 ac 06 91 00 00 00 00  |................|
0c364e60  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

I was excited to see this, but this may be GRUB's doing, not the BIOS's, dunno. Then again GRUB writes to the MBR, not the ass-end of the drive...
I can PM you the full dump if you want.

However, neither drive is even flagged as bootable, these are data drives only. The Toshibas are the bootable ones with the OS, so the BIOS shouldn't be messing with the WDs (which doesn't mean it isn't).


 Post subject: Re: WD Reds fail to IDENTIFY at boot
Posted: February 13th, 2015, 13:38

Joined: December 14th, 2011, 8:24
Posts: 60
Location: Cyberspace
EFI PART is the backup copy of GPT header which is supposed to be in, well, the last physical sector of the drive.
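
This can be checked against the dump itself. The sketch below decodes the bytes shown at offset 0c364e00 for the first WD using the standard 92-byte GPT header layout; the namedtuple and its field names are just labels chosen for this example.

```python
import struct
from collections import namedtuple

# Bytes at offset 0c364e00 of the first WD's dump, copied from the hexdump above.
HEXDUMP = """
45 46 49 20 50 41 52 54  00 00 01 00 5c 00 00 00
03 44 1b 31 00 00 00 00  af 6d 70 74 00 00 00 00
01 00 00 00 00 00 00 00  22 00 00 00 00 00 00 00
8e 6d 70 74 00 00 00 00  67 40 84 50 1c 19 42 8d
81 4d 3d 5f 5c 44 88 af  8f 6d 70 74 00 00 00 00
80 00 00 00 80 00 00 00  54 02 a2 a4 00 00 00 00
"""

GptHeader = namedtuple(
    "GptHeader",
    "signature revision header_size header_crc32 reserved my_lba "
    "alternate_lba first_usable_lba last_usable_lba disk_guid "
    "entries_lba num_entries entry_size entries_crc32",
)

# Little-endian, no padding: 92 bytes in total.
GPT_FORMAT = "<8sIIIIQQQQ16sQIII"


def parse_gpt_header(raw):
    """Unpack the fixed part of a GPT header from raw bytes."""
    size = struct.calcsize(GPT_FORMAT)
    return GptHeader(*struct.unpack(GPT_FORMAT, raw[:size]))


TOTAL_SECTORS = 1953525168            # from hdparm -I earlier in the thread
header = parse_gpt_header(bytes.fromhex(HEXDUMP))

print(header.signature, header.header_size)
print(header.my_lba, header.my_lba == TOTAL_SECTORS - 1)
print(header.alternate_lba, header.first_usable_lba, header.last_usable_lba)
```

The header says it lives at the last logical sector (sector count minus one) and points back at LBA 1, where the primary header sits. That is what a backup GPT header written by a partitioning tool looks like, which fits with it not being the BIOS's doing.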


 Post subject: Re: WD Reds fail to IDENTIFY at boot
Posted: February 13th, 2015, 17:12

Joined: February 9th, 2015, 17:12
Posts: 5
Location: here, really
ReclaiMe wrote:
EFI PART is the backup copy of GPT header which is supposed to be in, well, the last physical sector of the drive.


So the BIOS is not writing to the end of the disks.


 Post subject: Re: WD Reds fail to IDENTIFY at boot
Posted: February 13th, 2015, 17:27

Joined: September 8th, 2009, 18:21
Posts: 16960
Location: Australia
onetimeposter wrote:
ReclaiMe wrote:
EFI PART is the backup copy of GPT header which is supposed to be in, well, the last physical sector of the drive.


So the BIOS is not writing to the end of the disks.

It doesn't look like it. ISTM that you may have some other problem.



 Post subject: Re: WD Reds fail to IDENTIFY at boot
Posted: February 17th, 2015, 8:41

Joined: February 9th, 2015, 17:12
Posts: 5
Location: here, really
fzabkar wrote:
onetimeposter wrote:
ReclaiMe wrote:
EFI PART is the backup copy of GPT header which is supposed to be in, well, the last physical sector of the drive.


So the BIOS is not writing to the end of the disks.

It doesn't look like it. ISTM that you may have some other problem.


Well, as a wrap-up, what would the BIOS dump at the end of the drive? What would it look like under a hexdump? Would there be a string with the manufacturer's name?


 Post subject: Re: WD Reds fail to IDENTIFY at boot
Posted: February 17th, 2015, 17:09

Joined: September 8th, 2009, 18:21
Posts: 16960
Location: Australia
Here is your BIOS:

http://dlcdnet.asus.com/pub/ASUS/mb/soc ... M/5005.zip

When you examine the 5005.bin file with a hex editor, you will see text strings such as the following. If your BIOS were being backed up to your drive, then you would find those same strings, after exposing the Host Protected Area (HPA).

Code:
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000  21 10 2D 6C 68 35 2D C0 49 01 00 00 00 02 00 00  !.-lh5-ÀI.......
00000010  00 00 50 20 01 08 35 30 30 35 2E 42 49 4E C4 73  ..P ..5005.BINÄs
........
0007E000  41 77 61 72 64 20 42 6F 6F 74 42 6C 6F 63 6B 20  Award BootBlock
0007E010  42 49 4F 53 20 76 31 2E 30 0D 0A 43 6F 70 79 72  BIOS v1.0..Copyr
0007E020  69 67 68 74 20 28 63 29 20 32 30 30 30 2C 20 41  ight (c) 2000, A
0007E030  77 61 72 64 20 53 6F 66 74 77 61 72 65 2C 20 49  ward Software, I
0007E040  6E 63 2E 0D 0A 2A 42 42 53 53 2A 00 00 00 00 00  nc...*BBSS*.....
........
0007E100  EA 34 E5 00 F0 44 72 69 76 65 20 41 20 65 72 72  ê4å.ðDrive A err
0007E110  6F 72 2E 2E 2E 0D 0A 44 49 53 4B 20 42 4F 4F 54  or.....DISK BOOT
0007E120  20 46 41 49 4C 55 52 45 2C 20 49 4E 53 45 52 54   FAILURE, INSERT
0007E130  20 53 59 53 54 45 4D 20 44 49 53 4B 20 41 4E 44   SYSTEM DISK AND
0007E140  20 50 52 45 53 53 20 45 4E 54 45 52 42 49 4F 53   PRESS ENTERBIOS
0007E150  20 52 4F 4D 20 63 68 65 63 6B 73 75 6D 20 65 72   ROM checksum er
0007E160  72 6F 72 4B 65 79 62 6F 61 72 64 20 63 6F 6E 74  rorKeyboard cont
0007E170  72 6F 6C 6C 65 72 20 65 72 72 6F 72 4B 65 79 62  roller errorKeyb
0007E180  6F 61 72 64 20 65 72 72 6F 72 20 6F 72 20 6E 6F  oard error or no
0007E190  20 6B 65 79 62 6F 61 72 64 20 70 72 65 73 65 6E   keyboard presen
0007E1A0  74 0D 0A 44 65 74 65 63 74 69 6E 67 20 66 6C 6F  t..Detecting flo
0007E1B0  70 70 79 20 64 72 69 76 65 20 41 20 6D 65 64 69  ppy drive A medi
0007E1C0  61 2E 2E 2E 44 72 69 76 65 20 6D 65 64 69 61 20  a...Drive media
0007E1D0  69 73 20 3A 20 31 2E 34 34 4D 62 0D 0A 31 2E 32  is : 1.44Mb..1.2
0007E1E0  4D 62 20 0D 0A 37 32 30 4B 62 20 0D 0A 33 36 30  Mb ..720Kb ..360
0007E1F0  4B 62 20 0D 0A 00 00 00 00 00 00 00 00 00 00 00  Kb .............
........
0007FFC0  24 41 53 55 53 41 57 41 52 44 24 05 00 00 00 00  $ASUSAWARD$.....
0007FFD0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0007FFE0  00 1E FF FF 00 1C FF FF 4D 32 4E 50 56 2D 56 4D  ..ÿÿ..ÿÿM2NPV-VM
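
Searching a drive dump for those strings does not need a hex editor. A minimal Python sketch: the marker list is copied from the dump above, find_bios_markers is a made-up helper, and the image path is whatever file the tail of the drive was saved to (for example the /root/end.img used earlier in the thread).

```python
# ASCII strings visible in the 5005.bin dump above.
BIOS_MARKERS = [
    b"Award BootBlock BIOS",
    b"Award Software, Inc.",
    b"$ASUSAWARD$",
    b"M2NPV-VM",
]


def find_bios_markers(data, markers=BIOS_MARKERS):
    """Return {marker: first_offset} for every marker present in `data`."""
    hits = {}
    for marker in markers:
        pos = data.find(marker)
        if pos != -1:
            hits[marker.decode("ascii")] = pos
    return hits


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 2:                       # path to the saved image
        with open(sys.argv[1], "rb") as image:
            found = find_bios_markers(image.read())
        for text, offset in sorted(found.items(), key=lambda kv: kv[1]):
            print(f"{offset:08x}  {text}")
        if not found:
            print("no BIOS strings found")
```

An empty result on the tail image would be consistent with the BIOS not having written a copy of itself there, though it cannot rule out data stored in some other form.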


