Hard drive warranty expiring soon - cannot be detected in unraid but can be read in PC

I have an Unraid server which recently complained that a drive is missing from one of the SATA ports. The drive is a 4TB WD Red whose 3-year warranty expires on 10th Feb, so I would like WD to replace it. The problem is that while this drive cannot be detected in Unraid, a new 4TB IronWolf that I bought this morning is detected fine. When I swap the WD Red back into the Unraid server, the same thing happens again: it fails to detect the WD Red. Swap in the IronWolf once more and Unraid detects it fine, so I doubt it is my PC playing up.

When I connect the WD Red to my PC, it reports that its SMART status is fine, so I doubt WD will honour the warranty in that case. I am of the view that the hard drive is failing, but I just cannot prove it.

Any ideas on what I can do to confirm the actual health status of the hard drive?

Thanks in advance.

Comments

  • Bear in mind that the drive will need to go to HK for warranty. Given they can be had for around $120 right now, is it worth the hassle?

    Was it bought as a standalone drive, or shucked?

    • Thanks bro. It was a 4TB WD Red standalone drive that I bought from PLE. Not a shucked drive.

      Edit: I bought a new 4TB IronWolf this morning, so Unraid is rebuilding the drive and there is no data loss. What can I do to confirm the health of the failed WD Red?

  • I do think your hard drive is fine. Check with someone about configuration techniques.

  • +4

    I had a similar issue about a year ago. The drive appeared to be failing and its contents were only accessible from the parity drive. It turned out to be one of the SATA/power plugs to the drive. After changing it and rebuilding the disk, it hasn't had an issue.

    • +5

      I wanted to comment this exactly — I have a NAS running Unraid, also had the same issue of random errors + disk not being detected.

      I was using some really flaky SATA cables; literally those packs of like 20 for $5. Never would have thought that the cable would make such a huge deal!

      • +3

        Same experience for me in the past as well. I thought it would be nice to get a few "almost free sata cables" from aliexpress because of their $1 off $1.01 coupons.

        It took my local computer store a few days to figure it out, but everything turned out fine in the end.

    • Thanks bro. That's what I thought initially as well. I double-checked the SATA connection and had no luck.

      • I just want to clarify: this problem disk isn't detected by unRAID, but it shows up on another PC?

        • Yes indeed. The WD Red worked fine for 3 years straight, and since yesterday it is no longer detected by unRAID, but it works fine if I connect it to my Linux PC. SMART shows the disk is fine.

          My new IronWolf, when plugged into the same SATA port on unRAID, works immediately. No issues at all.

  • On the assumption it's a data or power cable, have you shuffled your drives around to see if the problem follows the drive or the cables?

    • Yes, done that.

      For clarity, the drive that failed is assigned as "Disk 7" on the Unraid array.

      As unRAID's disk assignment follows the drive ID number and is independent of the SATA port, I rearranged the other SATA drives and put one of the pre-existing drives onto the SATA cable for "Disk 7". Unraid picks up that drive with no issue, but when I plug the WD Red into the port for Disk 5, it is not detected there either. So I have pretty much ruled out a problem with my PC; it looks like the drive itself.

      I just need to prove it, but I can't.

      Edit: here is a partial printout of the SMART data

      === START OF INFORMATION SECTION ===
      Model Family: Western Digital Red
      Device Model: WDC WD40EFRX-68N32N0
      Serial Number: WD-WCC7K1VCXX42
      LU WWN Device Id: 5 0014ee 2120d1110
      Firmware Version: 82.00A82
      User Capacity: 4,000,787,030,016 bytes [4.00 TB]
      Sector Sizes: 512 bytes logical, 4096 bytes physical
      Rotation Rate: 5400 rpm
      Form Factor: 3.5 inches
      Device is: In smartctl database [for details use: -P show]
      ATA Version is: ACS-3 T13/2161-D revision 5
      SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
      Local Time is: Tue Jan 10 17:33:40 2023 AWST
      SMART support is: Available - device has SMART capability.
      SMART support is: Enabled
      AAM feature is: Unavailable
      APM feature is: Unavailable
      Rd look-ahead is: Enabled
      Write cache is: Enabled
      DSN feature is: Unavailable
      ATA Security is: Disabled, NOT FROZEN [SEC1]

      === START OF READ SMART DATA SECTION ===
      SMART overall-health self-assessment test result: PASSED

      General SMART Values:
      Offline data collection status:  (0x00) Offline data collection activity
                                              was never started.
                                              Auto Offline Data Collection: Disabled.
      Self-test execution status:      (   0) The previous self-test routine completed
                                              without error or no self-test has ever
                                              been run.
      Total time to complete Offline
      data collection:                (43440) seconds.
      Offline data collection
      capabilities:                    (0x7b) SMART execute Offline immediate.
                                              Auto Offline data collection on/off support.
                                              Suspend Offline collection upon new
                                              command.
                                              Offline surface scan supported.
                                              Self-test supported.
                                              Conveyance Self-test supported.
                                              Selective Self-test supported.
      SMART capabilities:            (0x0003) Saves SMART data before entering
                                              power-saving mode.
                                              Supports SMART auto save timer.
      Error logging capability:        (0x01) Error logging supported.
                                              General Purpose Logging supported.
      Short self-test routine
      recommended polling time:        (   2) minutes.
      Extended self-test routine
      recommended polling time:        ( 462) minutes.
      Conveyance self-test routine
      recommended polling time:        (   5) minutes.
      SCT capabilities:              (0x303d) SCT Status supported.
                                              SCT Error Recovery Control supported.
                                              SCT Feature Control supported.
                                              SCT Data Table supported.

      SMART Attributes Data Structure revision number: 16
      Vendor Specific SMART Attributes with Thresholds:
      ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
        1 Raw_Read_Error_Rate     POSR-K   200   200   051    -    0
        3 Spin_Up_Time            POS--K   171   171   021    -    6416
        4 Start_Stop_Count        -O--CK   100   100   000    -    207
        5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
        7 Seek_Error_Rate         -OSR-K   100   253   000    -    0
        9 Power_On_Hours          -O--CK   080   080   000    -    15266
       10 Spin_Retry_Count        -O--CK   100   100   000    -    0
       11 Calibration_Retry_Count -O--CK   100   253   000    -    0
       12 Power_Cycle_Count       -O--CK   100   100   000    -    61
      192 Power-Off_Retract_Count -O--CK   200   200   000    -    43
      193 Load_Cycle_Count        -O--CK   200   200   000    -    450
      194 Temperature_Celsius     -O---K   128   108   000    -    22
      196 Reallocated_Event_Count -O--CK   200   200   000    -    0
      197 Current_Pending_Sector  -O--CK   200   200   000    -    0
      198 Offline_Uncorrectable   ----CK   100   253   000    -    0
      199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
      200 Multi_Zone_Error_Rate   ---R--   100   253   000    -    0
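
For anyone who wants to check a printout like the one above without eyeballing every row, here is a minimal Python sketch. It assumes the eight-column layout shown in the printout (ID, name, flags, value, worst, threshold, fail, raw value). The names `WATCHED`, `SAMPLE`, `parse_attributes` and `nonzero_watched` are made up for illustration, and the list of counters to watch is a common rule of thumb, not WD's RMA criteria.

```python
# Hypothetical helpers for illustration; not part of smartctl or Unraid.

# Counters that people usually look at first when judging drive or cable health
# (an assumption based on common practice, not a vendor rule).
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}

# A few rows copied from the printout in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  5 Reallocated_Sector_Ct   PO--CK   200   200   140    -    0
  9 Power_On_Hours          -O--CK   080   080   000    -    15266
197 Current_Pending_Sector  -O--CK   200   200   000    -    0
198 Offline_Uncorrectable   ----CK   100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK   200   200   000    -    0
"""


def parse_attributes(text):
    """Return {attribute_name: raw_value} for rows in the eight-column layout."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected columns: ID, NAME, FLAGS, VALUE, WORST, THRESH, FAIL, RAW_VALUE.
        # The header row is skipped because its first field ("ID#") is not a number.
        if len(fields) < 8 or not fields[0].isdigit():
            continue
        raw = fields[7]
        if raw.isdigit():
            attrs[fields[1]] = int(raw)
    return attrs


def nonzero_watched(attrs):
    """Return only the watched counters whose raw value is above zero."""
    return {name: raw for name, raw in attrs.items() if name in WATCHED and raw > 0}


attrs = parse_attributes(SAMPLE)
flagged = nonzero_watched(attrs)
print(flagged if flagged else "no watched counter is non-zero")
```

On the rows pasted in this thread every watched counter is zero, which matches the PASSED self-assessment and is why the printout alone does not make a case for a failing drive.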

      • Oof, that makes it much tougher.

        If you don't have a use for it outside your RAID array then it might be worth selling as-is since your SMART data is fine; or otherwise install a backup array in a different PC?

        • I can stick this into my old HP54L as a standalone drive and keep using it, as the worst-case scenario if I can't get it replaced by WD. But it would be great if I could get this RMA'd before the warranty is up.

          I would really expect the NAS drives to last longer - but over the course of the last year, I have had (including this latest one) 3 x WD Red that failed at different times.

  • +2

    I work in data recovery. The SMART data is rarely ever wrong, but it can happen. If SMART is showing no pending or reallocated sectors then it's very likely the drive is fine. But the way to test it would be to connect it normally to your PC, download a free version of HD Sentinel and run a read-only surface scan. You can let it run over the whole thing if you want to be sure, but if you're not getting any errors after 10 minutes it's extremely unlikely the SMART is wrong.

    If it's reading just fine in Windows then I can't think of a good reason why it would suddenly not be recognised in the server. But if you need the data, the next thing to do would be to make a clone of it onto another 4TB drive that you've tested and know for sure the server recognises, and then replace the WD in the server with that. You can use something like ddrescue in Linux or HDD Raw Copy in Windows.

    After you've cloned the drive and the data on it is no longer important, you can run a write test over the drive too in HD Sentinel. If the drive is reading and writing just fine in Windows, then the only conclusion is the drive is fine and it was just some weird incompatibility problem in the server maybe caused by a software update or something.
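
To make the read-only scan idea concrete, here is a rough Python sketch of what such a scan does: read the target from start to finish and note where reads fail. It is a stand-in for illustration, not how HD Sentinel works internally. `read_scan` is a made-up name, `os.pread` is only available on POSIX systems such as Linux, and pointing it at a real disk would need a device path and root access, both of which are left out here. The demo runs against a temporary file instead.

```python
import os
import tempfile


def read_scan(path, chunk_size=1024 * 1024):
    """Read `path` from start to end.

    Returns (bytes_read, bad_offsets), where bad_offsets lists the starting
    offset of every chunk whose read raised an OS-level I/O error.
    """
    bad_offsets = []
    total = 0
    fd = os.open(path, os.O_RDONLY)  # opened read-only: nothing is written
    try:
        size = os.lseek(fd, 0, os.SEEK_END)  # total size of the file or device
        offset = 0
        while offset < size:
            want = min(chunk_size, size - offset)
            try:
                data = os.pread(fd, want, offset)
            except OSError:
                # Record the failing chunk and move on to the next one.
                bad_offsets.append(offset)
                offset += want
                continue
            if not data:
                break  # unexpected end of data
            total += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return total, bad_offsets


# Demo on a throwaway file so the sketch can be run safely anywhere.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(3 * 1024 * 1024 + 123))
try:
    scanned, bad = read_scan(tmp.name)
    print(f"read {scanned} bytes, {len(bad)} unreadable chunk(s)")
finally:
    os.unlink(tmp.name)
```

One caveat with a sketch like this: reads go through the operating system's cache, so a repeat run over the same data may not touch the platters again. A dedicated tool is still the better choice for evidence to send with a warranty claim.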

    • Thanks mate. That is a good idea. I will do as you suggested.

  • WD aren't going to really check the drive for the warranty claim. They'll just ship out a recertified drive to you.
