Status Degraded, no bad drives, read errors, or write errors; what does it mean & what should I do?

  • Last Post 10 January 2018
Larry Yaeger posted this 19 August 2017

One of my Pegasus2 R8 units is showing a warning icon in the Dashboard, and clicking on Disk Array it tells me the array's status is "degraded". But all the drives show green in the app, and diving deeper it says there are zero read errors and zero write errors.  What does this mean?  And what should I do?

I have updated all software and firmware to the latest. The Promise Utility app is at version 4.00.0000.08 (C01). Firmware (on all three units) is at 5.04.0000.61. Serial number of the problem unit is M92L14A30700065. I'm pretty sure it is registered (as are all the units). It is being used on a Mac Pro (Late 2013) running fully up-to-date macOS Sierra 10.12.6.

Dinesh Kannusamy posted this 19 August 2017

Hi Larry, 

 

Please create a web support ticket at support.promise.com and attach the subsystem report for further assistance.

Steps to save the subsystem report:

- Open the Promise Utility

- Click the subsystem information icon at the top of the Promise Utility window

- Click the lock symbol in the bottom-left corner of the screen to unlock the utility

- Click the save service report button to save the subsystem report, then attach it to the case.

The following Promise KB article explains how to attach files to a web support case:

http://kb.promise.com/thread/how-to-attach-files-to-your-web-case/

Thanks..

Larry Yaeger posted this 06 January 2018

Just a quick follow-up, in case anyone else finds themselves in this situation. I wasn't seeing any bad drives because one of the drive doors had accidentally been popped open while moving a heavy cable. Nothing in the Physical Drive list showed a problem in the GUI app. But I took a look at the subsystem report and happened to notice that PhyDrvOnline was 7. 7? Obviously it should be 8. And sure enough, looking in the GUI, there were 7 nice green status indicators, and nothing at all for the 8th. I then took a close look at the enclosure and discovered the 8th drive's door sitting very slightly ajar. I closed it, and *now* the 8th drive showed up everywhere (both the GUI app and the command-line promiseutil), but it showed up as "dead". Since I was almost entirely certain there was nothing wrong with the drive, I forced it back online (using the commands below), rebooted the unit (after ejecting from the Mac, I pulled the Thunderbolt cable out, waited for the drive to shut off, then reconnected the Thunderbolt cable), and all is well.
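For anyone who wants to automate the check that caught this, here is a minimal Python sketch that compares the PhyDrvOnline count in a saved subsystem report against the number of bays. The report layout is an assumption on my part: the sketch only assumes that the field name PhyDrvOnline is followed by an integer, optionally after a colon or equals sign. The function names and the sample text are hypothetical.

```python
import re


def drives_online(report_text):
    """Return the PhyDrvOnline count found in the report text, or None.

    Assumption: the field name is followed by an optional ':' or '='
    and then an integer. The real report layout may differ.
    """
    match = re.search(r"PhyDrvOnline\s*[:=]?\s*(\d+)", report_text)
    return int(match.group(1)) if match else None


def missing_drives(report_text, expected=8):
    """Return how many drives are not online, or None if the field is absent.

    expected defaults to 8 because the Pegasus2 R8 has eight bays.
    """
    online = drives_online(report_text)
    if online is None:
        return None
    return max(expected - online, 0)


# Hypothetical one-line excerpt, not copied from a real report.
sample = "PhyDrvOnline: 7"
print(missing_drives(sample))
```

A non-zero result means it is worth counting the status indicators in the GUI and checking the drive doors, since a missing drive is simply not listed.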

Forcing a dead drive back online is not something you should do lightly; do it only if you are genuinely confident the drive is okay. To do it, you run promiseutil from the command line, then enter:

phydrv -a online -p <drive#>

As I have multiple Pegasus# enclosures, and it was the second enclosure that had the problem, I had to first enter:

spath -a chgpath -t hba -p 2

And before I forced the drive online I made certain I was dealing with the right unit and had the right drive # by typing: 

phydrv -a list

At first the just-reconnected drive showed "Dead" for its status. After being forced online it showed "Forced On". After rebooting the enclosure, it just says "OK". And I think it really is.
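The commands above were presented out of order, so here is a small Python sketch that lists them in the order they were actually entered: select the enclosure (only needed with more than one), list the drives to confirm the target, then force the drive online. It only builds the command strings quoted in this post; it does not run promiseutil, and the function name and the example numbers are hypothetical. Take the real drive number from the phydrv -a list output.

```python
def recovery_commands(drive, enclosure=None):
    """Return the promiseutil commands from this post, in the order used.

    drive: the drive number as reported by 'phydrv -a list'.
    enclosure: the value passed to 'spath ... -p'; leave as None when
    only one enclosure is attached.
    """
    commands = []
    if enclosure is not None:
        # Switch promiseutil to the enclosure that has the problem.
        commands.append(f"spath -a chgpath -t hba -p {enclosure}")
    # Confirm the unit and the drive number before changing anything.
    commands.append("phydrv -a list")
    # Force the drive back online.
    commands.append(f"phydrv -a online -p {drive}")
    return commands


# Example values only: second enclosure, drive number 8.
for line in recovery_commands(drive=8, enclosure=2):
    print(line)
```

Keeping the list step ahead of the online step is the point of the sketch: it is the safeguard against forcing the wrong drive online.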

I don't know if it was necessary, but I then went into Background Activities and kicked off a Redundancy Check on the unit, just to be safe.  I also bumped up its priority to "High".

P.S. Promise folks... You really should make the GUI show the slot and mark it as empty, in red, when the drive is entirely missing or the door is open like this.

Venkatachalam Settu posted this 06 January 2018

Hi Larry,

The drive will not be listed in the GUI window if it is ejected or accidentally popped out. A drive will be marked as dead if it is removed or inserted while the unit is running.

Your suggestion may be considered for an upcoming software/firmware release.

Thank you for your valuable suggestion.

Larry Yaeger posted this 08 January 2018

Well, I was surprised how quickly things seemingly went back to normal, and worried they weren't really. And it turns out they aren't. Running that background Redundancy Check I mentioned produced a Major event stating "Redundancy check encountered inconsistent block(s)".  So what next?

Two possibilities come to mind:

1) Force a Synchronization. If that's possible. It is not possible from the GUI.

2) Format the drive. I copied everything off the drive before I started messing with it, so that's a perfectly acceptable thing to do.

Problem is, I don't know if either will fix the problem. More educated suggestions welcome.

I think I'm going to format the drive and rerun the Redundancy Check. I just wish the check didn't take so long, even on High priority, but I guess that's the price of a large array with sophisticated error correction.

Larry Yaeger posted this 08 January 2018

P.S. I had Autofix enabled; any chance that corrected the problem?  (Nothing about the event message suggested this.)

Larry Yaeger posted this 10 January 2018

Okay, following a format of the drive a second Redundancy Check showed no errors. Guess I'm back in business.
