Hi all,
Hope this is the right area to post in - if not, feel free to move this post.
My hard disk in my laptop suddenly died; one minute I was happily reading files on it, the next I got "I/O errors", and once I shut everything down and restarted, the laptop wouldn't even boot with the drive inside; more detail about that in
https://superuser.com/questions/1263596/troubleshooting-failing-disk-hardware-no-medium-present-on-linux.
As written in that post, after this event, Linux would detect this drive only as:
Code:
Oct 29 21:48:29 mypc kernel: [ 1363.961212] scsi 4:0:0:0: Direct-Access osz osz osz osz osz osz AD04 PQ: 0 ANSI: 6
Oct 29 21:48:29 mypc kernel: [ 1363.964557] sd 4:0:0:0: [sdb] Attached SCSI removable disk
Oct 29 21:48:29 mypc kernel: [ 1363.964978] sd 4:0:0:0: Attached scsi generic sg1 type 0
... and `mount`, `fdisk -l`, or `parted -l` wouldn't even see this drive; the only tools that would see it were:
Code:
$ sudo lshw -class disk -class storage -short
H/W path Device Class Description
===================================================
/0/100/1f.2 storage 7 Series Chipset Family 6-port SATA Controller [A
/0/2 scsi0 storage
/0/2/0.0.0 /dev/sda disk 128GB SanDisk SSD U100
/0/3 scsi4 storage
/0/3/0.0.0 /dev/sdb disk osz osz osz osz
/0/3/0.0.0/0 /dev/sdb disk
$ sudo smartctl --all /dev/sdb
smartctl 6.2 2013-07-26 r3841 [i686-linux-4.4.0-57-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
/dev/sdb: Unknown USB bridge [0x05e3:0x0735 (0x4104)]
Please specify device type with the -d option.
Use smartctl -h to get a usage summary
$ sudo smartctl --all -d scsi /dev/sdb
smartctl 6.2 2013-07-26 r3841 [i686-linux-4.4.0-57-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: osz osz
Product: osz osz osz osz
Revision: AD04
Logical block provisioning type unreported, LBPME=-1, LBPRZ=0
Device type: disk
Local Time is: Sun Oct 29 22:25:01 2017 CET
NO MEDIUM present on device
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
So, I tried to do some data recovery via Ultimate Boot CD; however, Clonezilla and the like also failed to detect the drive.
So I ended up trying MHDD, and these are the results so far. First off, I did this on a spare laptop I have, which does have a functioning primary hard disk. So I first ran MHDD (via Ultimate Boot CD, booted from a USB stick) on that working drive, to see how it behaves.
At start I get this "PCI scan" screen:
(getting "It was not possible to determine the dimensions of the image." if I use img bbcode, so here's the image link:
https://imgur.com/NQq36di )
It shows two Intel... somethings, I guess those are bus controllers? Anyways, soon after, that screen disappears, and I get this:
https://imgur.com/vHPHsBI
The working drive (Fujitsu) is shown under option 6, under "PCI controllers". So, I choose that, and then I issue commands ID and EID:
https://imgur.com/h4O7V07
So far so good, ID information is listed, as expected.
Now, I turn everything off, and insert the dead drive - a Hitachi Z5K500-500 - instead of the working one, and repeat the procedure. The first screen is the same; but on the second screen:
https://imgur.com/OvAi784
... there is nothing under option 6 under "PCI controllers". I choose it anyway, and try to issue ID/EID:
https://imgur.com/MiX4fUT
... and I get "Device not ready", and the "ERR" flag is marked red. The manual
http://hddguru.com/software/2005.10.02-MHDD/mhdd_manual.en.html notes:
Quote:
When you see ERROR flag (ERR) you can look at the error register where you can see what kind of error happened. See ATA/ATAPI standard for more information about registers and commands.
... unfortunately, I don't really understand how I can look at the error register - can anyone help?
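For reference, this is how I currently understand the two registers the manual talks about: the ATA status register and the ATA error register are each one byte, with one flag per bit. Below is a minimal Python sketch that turns a raw register byte into flag names. The bit names are the classic ones from the ATA/ATAPI standard; the function names and the example values are my own and hypothetical, they are not read from my drive. I still don't know where MHDD shows the raw values.

```python
# Bit names, listed from bit 0 up to bit 7, as defined in the classic
# ATA/ATAPI register layout. Later revisions rename or retire some of them.
ATA_STATUS_BITS = ["ERR", "IDX", "CORR", "DRQ", "DSC", "DF", "DRDY", "BSY"]
ATA_ERROR_BITS = ["AMNF", "TK0NF", "ABRT", "MCR", "IDNF", "MC", "UNC", "BBK/ICRC"]


def decode_flags(value, names):
    """Return the names of all bits that are set in an 8-bit register value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("register value must fit in one byte")
    return [name for bit, name in enumerate(names) if value & (1 << bit)]


def decode_ata_status(value):
    """Decode the ATA status register (hypothetical helper name)."""
    return decode_flags(value, ATA_STATUS_BITS)


def decode_ata_error(value):
    """Decode the ATA error register (hypothetical helper name)."""
    return decode_flags(value, ATA_ERROR_BITS)


if __name__ == "__main__":
    # Hypothetical values: status 0x51 with error 0x04.
    print(decode_ata_status(0x51))
    print(decode_ata_error(0x04))
```

So if the ERR bit is set in the status register, the error register is the byte that says which kind of error it was (for example ABRT for an aborted command, or UNC for an uncorrectable data error).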
Also, before sticking the dead drive in the bay of the laptop (so it is attached directly via SATA), I tried attaching it through a USB/SATA enclosure. During boot of (I guess) FreeDOS from Ultimate Boot CD, when I chose to install USB drivers, I'd get errors like:
Code:
LBACACHE flush: write error.2080/LBA#0101
dos mem corrupt, first_mcb=028a PANIC: MCB chain corrupted, system halted
The only reference to something like this that I found is on
http://freedos.10956.n7.nabble.com/LBACache-error-message-Is-there-retry-td11896.html, which says:
Quote:
LBACache flush: write error.0880/LBA#0101
Explanation by Eric is:
E> means: DMA overrun (error
on drive 80 (first harddisk),
E> 1 of 1 sectors successfully written (but LBAcache does not
E> believe that, because it got the DMA overrun error status,
E> so it assumes that the write did not work and, to be safe,
E> forgets all cached data for that drive / restarts itself
E> for that drive)...
... however, I have "write error.2080", not "write error.0880".
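My guess, based on Eric's explanation above, is that the four hex digits are a BIOS int 13h status byte followed by the BIOS drive number (so "0880" would be status 08 on drive 80). That layout is an assumption on my part; I have not checked it against the LBACache source. Under that assumption, a small Python sketch to decode such a code could look like this. The function name is hypothetical, and the table lists only some of the standard int 13h status codes:

```python
# Partial table of BIOS int 13h status codes (not a complete list).
INT13_STATUS = {
    0x01: "invalid function or parameter",
    0x02: "address mark not found",
    0x04: "sector not found",
    0x08: "DMA overrun",
    0x10: "uncorrectable CRC or ECC error",
    0x20: "controller failure",
    0x40: "seek failed",
    0x80: "timeout (not ready)",
    0xAA: "drive not ready",
}


def decode_lbacache_code(code):
    """Split a 4-hex-digit code into (status, drive, description).

    ASSUMPTION: first two digits = int 13h status, last two = drive number.
    """
    if len(code) != 4:
        raise ValueError("expected four hex digits, e.g. '2080'")
    status = int(code[:2], 16)
    drive = int(code[2:], 16)
    description = INT13_STATUS.get(status, "unknown status")
    return status, drive, description


if __name__ == "__main__":
    for code in ("0880", "2080"):
        status, drive, description = decode_lbacache_code(code)
        print(f"{code}: status {status:02X}h ({description}) on drive {drive:02X}h")
```

If that reading is right, my "2080" would be a "controller failure" status on drive 80 (the first hard disk), rather than the DMA overrun from the quoted post.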
Ironically, most of what is on this drive is open source builds, so I'd otherwise throw it away and not worry about it - except there are a few files (some spreadsheets regarding my salary) that I'd really need. So I'd like to make sure that there is really nothing I can do myself, before I consider sending this to a data recovery center. Beyond that, I'd like to know - if possible - what went wrong on this drive. I've had disk failures before, but they were either caused by something obvious, like the power being cut, or they'd fail loudly, so you could hear the needle digging itself into the platters; this time, it happened from one moment to the next, with no obvious cause and no weird sounds either.
So, is there anything else, troubleshooting-wise, that I could do with free tools, to at least find out what is wrong with this disk (maybe it is some controller that failed, and not the magnetic media)?
Thanks in advance!