All times are UTC - 5 hours [ DST ]




 Post subject: Very slow disk speed [noob] [Linux]
PostPosted: December 13th, 2021, 13:33 

Joined: December 13th, 2021, 12:55
Posts: 3
Location: Earth
Hi,

I have a Western Digital Elements / My Passport (USB, AF) 2.5'' external USB drive I use for backups.
It was working fine before. I finished my backup, unmounted the disk, unplugged the USB cable, put it in my backpack, and hit the road. When I got home and went to use the disk again I noticed the performance issues.
The drive did not suffer any bump or drop. I'm wondering if when I unplugged the cable the arm/heads weren't parked yet and the tiniest of vibrations during travel were enough to damage the drive.

I've been recovering my data using ddrescue at a file object level. Just to be clear, I'm not copying the partition or disk.
I'm not sure if this is a good idea or not but there are some reasons:
- I don't need to access the whole disk, just about 50% of it. So less wear and tear?
- The disk is software encrypted. So I'm afraid if I end up with missing or bad blocks in my output image I won't be able to determine which files are corrupt.
- I don't have enough disk space to create an image, although I've now bought another drive and it's on its way.

Recovery speed is excruciatingly slow. On average files are being copied at 100 KB/s.
I've noticed that within the same file the speed can burst to the normal range of ~50 MB/s, but then at some point it grinds to a halt. This seems to happen for all big files (think >300 MB).
Is this behavior to be expected for all files on the disk? Or maybe just a subset of them?
I'm wondering about the root cause and its implications.


For a period of time I could hear some clicking at a rate of about 1 click per second. Some hours later it got louder but at about half the rate. Right now there is no clicking at all.
I ran smartmontools when I first noticed the disk speed issues:
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   134   134   051    -    2175
  3 Spin_Up_Time            POS--K   189   185   021    -    5508
  4 Start_Stop_Count        -O--CK   100   100   000    -    109
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   100   100   000    -    179
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   253   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    70
192 Power-Off_Retract_Count -O--CK   200   200   000    -    64
193 Load_Cycle_Count        -O--CK   200   200   000    -    2287
194 Temperature_Celsius     -O---K   129   098   000    -    23
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0

And now
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   001   001   051    NOW  65535
  3 Spin_Up_Time            POS--K   192   185   021    -    5375
  4 Start_Stop_Count        -O--CK   100   100   000    -    110
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K   200   200   000    -    0
  9 Power_On_Hours          -O--CK   100   100   000    -    213
 10 Spin_Retry_Count        -O--CK   100   100   000    -    0
 11 Calibration_Retry_Count -O--CK   100   253   000    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    70
192 Power-Off_Retract_Count -O--CK   200   200   000    -    64
193 Load_Cycle_Count        -O--CK   200   200   000    -    2290
194 Temperature_Celsius     -O---K   111   098   000    -    41
196 Reallocated_Event_Count -O--CK   200   200   000    -    0
197 Current_Pending_Sector  -O--CK   200   200   000    -    5
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0
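
To make the differences between the two snapshots easier to spot, here is a minimal Python sketch that diffs them. The rows are copied from the two tables above (only a subset, for brevity); the function names are made up for illustration, and the failure rule in the comment (VALUE at or below THRESH, with a THRESH of 0 never failing) is how smartmontools is generally understood to decide the FAIL column, so treat it as an assumption.

```python
# Compare two smartctl attribute snapshots and list what changed.
# Rows are copied from the two tables in this post (subset only).

BEFORE = """\
  1 Raw_Read_Error_Rate     POSR-K   134   134   051    -    2175
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
193 Load_Cycle_Count        -O--CK   200   200   000    -    2287
194 Temperature_Celsius     -O---K   129   098   000    -    23
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
"""

AFTER = """\
  1 Raw_Read_Error_Rate     POSR-K   001   001   051    NOW  65535
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
193 Load_Cycle_Count        -O--CK   200   200   000    -    2290
194 Temperature_Celsius     -O---K   111   098   000    -    41
197 Current_Pending_Sector  -O--CK   200   200   000    -    5
"""


def parse(table):
    """Return {attribute_name: (value, thresh, raw)} for each attribute row."""
    rows = {}
    for line in table.splitlines():
        fields = line.split()
        if len(fields) < 8 or not fields[0].isdigit():
            continue  # skip headers and blank lines
        _id, name, _flags, value, _worst, thresh, _fail, raw = fields[:8]
        rows[name] = (int(value), int(thresh), int(raw))
    return rows


def diff(before, after):
    """Return (name, old_raw, new_raw, failing) for rows that changed or fail."""
    old, new = parse(before), parse(after)
    changes = []
    for name, (value, thresh, raw) in new.items():
        old_raw = old[name][2]
        # Assumed rule: an attribute fails when VALUE <= THRESH,
        # and a THRESH of 0 means it can never fail.
        failing = thresh > 0 and value <= thresh
        if raw != old_raw or failing:
            changes.append((name, old_raw, raw, failing))
    return changes


for name, old_raw, new_raw, failing in diff(BEFORE, AFTER):
    flag = "  <-- at or below threshold" if failing else ""
    print(f"{name:24} {old_raw:>6} -> {new_raw:>6}{flag}")
```

On these rows it reports Raw_Read_Error_Rate (now below its threshold), Load_Cycle_Count, Temperature_Celsius and Current_Pending_Sector as changed, while Reallocated_Sector_Ct stays at 0.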

From Wikipedia:
Quote:
Raw_Read_Error_Rate - Stores data related to the rate of hardware read errors that occurred when reading data from a disk surface. The raw value has different structure for different vendors and is often not meaningful as a decimal number. For some drives, this number may increase during normal operation without necessarily signifying errors.
Current_Pending_Sector - Count of "unstable" sectors (waiting to be remapped, because of unrecoverable read errors). If an unstable sector is subsequently read successfully, the sector is remapped and this value is decreased.

From the userspace side things have been fine: ddrescue hasn't reported any errors and the recovered files seem fine too. But judging by the SMART Current_Pending_Sector count, it looks like some of the files I've recovered may no longer be intact!?
The disk has been in use for more than 24h now trying to recover the data, and the temperature has almost doubled (23 to 41), though it is still within operating range. Should I power it off and let it cool down?


I've already copied the most important files.
But at this speed it will take more than a month running 24 hours a day, which is just not feasible. I'm guessing the drive will die before then too.
I'm ok with the drive dying. Very annoying but nothing critical.
I'll be rescuing files based on their importance level.
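
As a rough sanity check on the "more than a month" estimate, the arithmetic can be sketched as below. The 100 KB/s and ~50 MB/s rates are the ones quoted earlier in this post; decimal units (1 KB = 1000 bytes) and the helper names are assumptions, and since the amount of data still to copy is not stated, no total is filled in.

```python
# Back-of-the-envelope throughput arithmetic for the rates quoted in the post.
# Decimal units are assumed: 1 KB = 1000 bytes, 1 GB = 1000**3 bytes.

SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(rate_bytes_per_s):
    """Bytes copied in one day of continuous running at the given rate."""
    return rate_bytes_per_s * SECONDS_PER_DAY


def days_needed(total_bytes, rate_bytes_per_s):
    """Days of continuous copying needed for total_bytes at the given rate."""
    return total_bytes / bytes_per_day(rate_bytes_per_s)


SLOW = 100 * 1000        # 100 KB/s, the average rate reported
FAST = 50 * 1000**2      # ~50 MB/s, the burst rate reported

print(bytes_per_day(SLOW) / 1000**3)        # 8.64  (GB per day)
print(30 * bytes_per_day(SLOW) / 1000**3)   # 259.2 (GB in 30 days)
print(FAST // SLOW)                         # 500   (slowdown factor)
```

So at the slow rate roughly 8.6 GB can be copied per day, and anything beyond about 260 GB would indeed take longer than a month.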

Is there any Linux software I can use to check the health of the disk heads? I looked into hdparm but couldn't figure out a way.
Or something that lets me understand the root cause and get an idea how much time the drive has? I mean, how much data should I be able to recover?

Any other comments or suggestions?

Thank you for reading.


 Post subject: Re: Very slow disk speed [noob] [Linux]
PostPosted: December 14th, 2021, 4:42 

Joined: July 12th, 2010, 4:38
Posts: 1418
Location: Portugal
You might have one or more weak heads, which would explain those jumps in speed, or you might have the WD slow issue (or both combined).
It may be better to send it to someone who can deal with those problems.

_________________
http://www.pclab.com.pt facebook.com/PCLAB.A.T
ACELab partner


 Post subject: Re: Very slow disk speed [noob] [Linux]
PostPosted: December 15th, 2021, 16:58 

Joined: December 13th, 2021, 12:55
Posts: 3
Location: Earth
Drive is now powered off until I get a new drive so I can ddrescue the whole thing, since that still looks like it will be the least aggressive way to get the data. My biggest concern is mapping any bad blocks back to files in the filesystem, but maybe I'm lucky and will be able to recover 100%.

However, I'm confused about what has happened, as I was using ddrescue at a file level, which, if I understood correctly, goes through the OS/kernel but just retries things a few times before failing.
smartctl shows there have been 13 read errors. These are visible in the log and are also accounted for in Current_Pending_Sector.
Likewise, there have been 13 errors reported by the kernel.
However I don't understand a few things:

1) Why don't the reported bad/unreadable sectors match?
For example:
smartctl says Error: UNC 256 sectors at LBA = 0xacb60238 = 2897609272
While the kernel says blk_update_request: critical medium error, dev sdb, sector 2897609216 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 0

2) How many bytes of data are lost with each error?
From smartctl : Sector Sizes: 512 bytes logical, 4096 bytes physical
So from the above: 256 sectors * 512 bytes = 131072 bytes! Is this correct? Or was it only a single sector: 512 bytes?

3) Was there really an unrecoverable error?
I used ddrescue with the mapfile option. However at the end of the execution it stated that all data had been copied. The files in question are video files and so far I wasn't able to detect any visible corruption. Either corruption is minimal (eg: 512 bytes) or non existent at all.
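
One way to double-check what ddrescue actually managed to read is to total up the block statuses in its mapfile. The sketch below assumes the GNU ddrescue mapfile layout (comment lines starting with `#`, one current-status line, then `pos size status` rows in hex, with `+` meaning finished and `-` meaning bad sector). `SAMPLE` is a made-up mapfile for illustration only, not output from this drive, and `summarise` is a hypothetical helper name.

```python
# Summarise a GNU ddrescue mapfile: total bytes per block status.
# SAMPLE is invented for illustration; it is NOT from the drive in this thread.

SAMPLE = """\
# Mapfile. Created by GNU ddrescue
# current_pos  current_status  current_pass
0x00120000     ?               1
#      pos        size  status
0x00000000  0x00117000  +
0x00117000  0x00000200  -
0x00117200  0x00001000  /
0x00118200  0x00007E00  *
0x00120000  0x00EE0000  ?
"""

STATUS_NAMES = {
    "?": "non-tried",
    "*": "non-trimmed",
    "/": "non-scraped",
    "-": "bad-sector",
    "+": "finished",
}


def summarise(mapfile_text):
    """Return {status_name: total_bytes} for the block rows of a mapfile."""
    totals = {name: 0 for name in STATUS_NAMES.values()}
    data_lines = [
        line for line in mapfile_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    # The first non-comment line is the current position/status, not a block.
    for line in data_lines[1:]:
        _pos, size, status = line.split()[:3]
        totals[STATUS_NAMES[status]] += int(size, 16)
    return totals


totals = summarise(SAMPLE)
for name, size in totals.items():
    print(f"{name:12} {size:>10} bytes")
```

A mapfile whose block rows are all `+` means every byte of that input was read successfully as far as ddrescue could tell. With a file-level copy that only covers the sectors belonging to that file, so pending sectors elsewhere on the disk would not show up in it.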

Is anyone able to clarify these questions?
Thanks


(not sure if this will work but...)
Pinging the local Linux expert @maximus
maximus wrote:
.


 Post subject: Re: Very slow disk speed [noob] [Linux]
PostPosted: December 18th, 2021, 18:48 

Joined: December 13th, 2020, 18:47
Posts: 53
Location: recoverland
iklazusf wrote:
Drive is now powered off until I get a new drive so I can ddrescue the whole thing, since that still looks like it will be the least aggressive way to get the data. My biggest concern is mapping any bad blocks back to files in the filesystem, but maybe I'm lucky and will be able to recover 100%.

However, I'm confused about what has happened, as I was using ddrescue at a file level, which, if I understood correctly, goes through the OS/kernel but just retries things a few times before failing.
smartctl shows there have been 13 read errors. These are visible in the log and are also accounted for in Current_Pending_Sector.
Likewise, there have been 13 errors reported by the kernel.
However I don't understand a few things:

1) Why don't the reported bad/unreadable sectors match?
For example:
smartctl says Error: UNC 256 sectors at LBA = 0xacb60238 = 2897609272

I've never seen this message. Keep on hiding the log! 8)
Quote:

While the kernel says blk_update_request: critical medium error, dev sdb, sector 2897609216 op 0x0:(READ) flags 0x80700 phys_seg 16 prio class 0

Why should they? How is this related?
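
For what it's worth, the two numbers in the quoted lines can be compared directly. The sketch below uses only the values from the post (the smartctl LBA, the kernel sector, the 256-sector count and the 512/4096-byte sector sizes). The interpretation, that the kernel logs the start of its read request while the drive logs the LBA where the read failed, is an assumption and not something either log states.

```python
# Compare the LBA from the smartctl error log with the sector in the kernel log.
# All numbers are copied from the post; the interpretation is an assumption.

LOGICAL_SECTOR = 512       # bytes, "512 bytes logical"
PHYSICAL_SECTOR = 4096     # bytes, "4096 bytes physical"

smart_lba = 0xACB60238       # "UNC 256 sectors at LBA = 0xacb60238"
kernel_sector = 2897609216   # "critical medium error, dev sdb, sector 2897609216"
request_sectors = 256        # sector count shown by smartctl

# How far into the (assumed) 256-sector request the failing LBA lies.
offset = smart_lba - kernel_sector
inside_request = 0 <= offset < request_sectors
logical_per_physical = PHYSICAL_SECTOR // LOGICAL_SECTOR

print(smart_lba)                         # 2897609272
print(offset)                            # 56
print(inside_request)                    # True
print(offset * LOGICAL_SECTOR)           # 28672
print(smart_lba % logical_per_physical)  # 0
```

The smartctl LBA is 56 logical sectors (28672 bytes) past the kernel's sector and falls on a 4096-byte physical sector boundary, which would be consistent with both logs describing the same failed read.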
Quote:

2) How many bytes of data are lost with each error?
From smartctl : Sector Sizes: 512 bytes logical, 4096 bytes physical
So from the above: 256 sectors * 512 bytes = 131072 bytes! Is this correct? Or was it only a single sector: 512 bytes?

Could be, depending on the command given to the drive.
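
To put rough bounds on that, here is a small sketch of the arithmetic using the figures from the question: 512-byte logical sectors, 4096-byte physical sectors, a 256-sector request and 13 reported errors. Which of the three cases actually applies depends on the drive and the command, so the case labels are assumptions and `loss_bounds` is just an illustrative helper.

```python
# Bounds on the data affected per read error, using figures from the post.
# Which case applies is not known; the three labels are assumptions.

LOGICAL_SECTOR = 512      # bytes
PHYSICAL_SECTOR = 4096    # bytes
REQUEST_SECTORS = 256     # logical sectors per failed request
ERRORS = 13               # errors reported by smartctl and the kernel


def loss_bounds(errors):
    """Return bytes affected under three assumptions about each error."""
    return {
        "one logical sector each": errors * LOGICAL_SECTOR,
        "one physical sector each": errors * PHYSICAL_SECTOR,
        "whole request each": errors * REQUEST_SECTORS * LOGICAL_SECTOR,
    }


for label, size in loss_bounds(1).items():
    print(f"per error, {label:26} {size:>8} bytes")

for label, size in loss_bounds(ERRORS).items():
    print(f"13 errors, {label:26} {size:>8} bytes")
```

Per error that gives 512, 4096 or 131072 bytes; across all 13 errors the range is 6656 bytes to 1703936 bytes (about 1.7 MB).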
Quote:

3) Was there really an unrecoverable error?
I used ddrescue with the mapfile option. However at the end of the execution it stated that all data had been copied. The files in question are video files and so far I wasn't able to detect any visible corruption. Either corruption is minimal (eg: 512 bytes) or non existent at all.
Wrong conclusion. If I break into your house, your wife hears me and says "go see where the invader is", and the only thing you do is look in the entrance hall while I hide in your cellar, would you ask your wife "Did you really hear an invader?"

Got it?!


 Post subject: Re: Very slow disk speed [noob] [Linux]
PostPosted: January 16th, 2022, 11:20 

Joined: December 13th, 2021, 12:55
Posts: 3
Location: Earth
This post is way overdue, but...

FWIW I was able to recover all the data with ddrescue. It took me about 1 week to create an image of the partition. All data seems to be good.


When I get the time I'll try to experiment with the failing drive.
I'll see if I can send a command or override the firmware's re-lo (relocation list) module to fix the slow speed, and also scan for and mark any bad blocks.
That way it could serve as a spare drive for a Raspberry Pi media center. No problem if it fails for good; I just need it to work at a decent speed, even if the available space is smaller.


 Post subject: Re: Very slow disk speed [noob] [Linux]
PostPosted: January 19th, 2022, 7:58 

Joined: December 9th, 2009, 5:31
Posts: 46
Your disk most likely has a problem with an overflowing RList buffer.
Perhaps at least one of the heads (one that supports the SA, the service area) has a write problem.
The HDD firmware then erroneously detects that the disk has problems reading certain sectors and adds them to the list of potential defects (RList).
As a result, the disk starts freezing and transfers become very slow. The RList is module 32.

After cleaning this module, the drive should return to normal speed - if there are no other problems of course .... :)

