Author Topic: Too many failed devices! Failed=2 error on single failed DRU restore - HELP  (Read 832 times)

Offline mephisto

  • Newbie
  • *
  • Posts: 17
  • Karma: +0/-0
    • View Profile
Been using FR for about 2 years, experienced first drive failure. I have a nightly process that does a verify and emails me letting me know the volume has been updated. Replaced disk with identical drive, but I keep getting this error message.

1. I can navigate to FlexRAID-Managed-Pool fine and see/can navigate everything.
2. The log file in TRACE shows everything as valid.
3. The UI shows only a single failed drive, and everything else as HEALTHY.

No matter how I try to restore this (inside the pool, to an external drive, etc.), I get this error.

Attached log file in TRACE mode. It's nothing but "TRACE: Valid" followed by this horrible error. Really need help on this, I was honestly expecting this restore to "just work" but it's not doing anything.

Help, please!

Note: I had to truncate the log to get it to upload. I just removed a bunch of the TRACE: Valid lines on various family pictures and videos, etc.





Offline mephisto

Attached is a picture of my FlexRAID-Managed-Pool if it helps.

My configuration is:

DRU1
DRU2 (The failed one)
DRU3
DRU4
PPU1

Windows 10. Latest FlexRAID. StoragePool via Cruise Control.

I'm completely baffled; everything says VALID including the checking of the Parity. I can see files on all the drives (except the failed one), etc. Everything shows 1 failed DRU, but the restore refuses to do anything saying there are 2 failed devices. What is the other failed device?

Offline mephisto

Also what's weird is in the UI it's now showing everything as 0 bytes, and the Info/graphs no longer work. Despite that, the pool mounts fine and shows its appropriate size (minus the one dead DRU). Screenshot attached.

I noticed there was one folder in the FlexRAID-Managed-Pool that didn't exist (there were 6 items in there despite only 5 drives). I moved that elsewhere to see if it would do anything - it didn't. No restores work, UI still shows everything as 0.


Offline mephisto

Can anyone help please? It's been several days and I have nothing else to try that I know of.

Does the product simply not work or have some major bug? It says everything is OK in the Trace, but it refuses to restore anything. Could really use some help; this is really not comforting.

Offline ThreeIsTwoTwoIsOneOneIs..

  • Newbie
  • *
  • Posts: 11
  • Karma: +0/-0
  • Small 4 Drive Pool, for now
    • View Profile
It's a dead product. Most of the userbase is already gone, I suspect. I'm reviewing alternatives.


What condition is the failed disk in? If it's not gone the way of the dodo, 8 hrs of SpinRite may get it back to a state where you can mount it and pull all the data.
DS380 - C2750 - WD Reds

Offline mephisto

I'm hoping that's not the case...

I still have the drive, but it's got the click of death and won't spin up so I can't pull anything off of it via USB caddy/etc. FlexRAID says there is 1 failed drive, and that the Parity is Valid. If only it would restore! Help please!

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,524
  • Karma: +202/-16
    • View Profile
Far from it.
This is a community forum, and I have simply scaled back from answering every question. ;)

Offline mephisto

So I'm confused - what should I do here? Are there other people who know the technical details of this product and what to do in this situation?

I bought a product that is supposed to just work in this scenario. The logs say everything is fine. But it won't restore any data because it says 2 drives have failed - but it only shows 1 failed drive. It's been a week and nobody has been able to help. Are you able to help, and if not what does that mean?

I'm just out of luck, I lost all the data on that drive? I assume I should then get a refund since the product didn't do what it's supposed to do?

I could really use some help. Please. And if the answer is nobody is going to help me, at least that's an answer.

Thanks

Offline Brahim

Two things:

1. Post enough info for others to be able to help. If there is no response to an initial post, then more detail is typically needed.

2. If absolutely stuck and no useful input from the forum, do consider paid support.

The challenge with a product that is part of a system where there are many variables is that a lot of troubleshooting is often needed. This can even include having to log into the machine to troubleshoot the issue. Every product is supposed to work as specified. Often, there is interference from other things in that system, as well as user error, to consider.

If you are getting a message that there are two failures but are only seeing one drive as being flagged in the UI, that simply means that another drive has a lot of deleted files. Snapshot RAID has some specifics: http://wiki.flexraid.com/2011/10/18/understanding-the-limitations-of-snapshot-raid/

The logs posted were edited, which is a super faux pas. Full logs are always required, as the information you are removing is likely the information that others could use to give you clues.



Offline mephisto

Thanks.

I can't post the full log, because the forum limits the size and won't let me upload it. So my option was don't post any log, or remove a bunch of essentially identical lines. Essentially IMAGEXXX.JPG 01-5,000 all have the identical log message, so my choice was don't post anything (forum won't let me) or I truncate those identical messages. Is there a better option?
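One option that keeps every line is to compress the log instead of trimming it. Thousands of near-identical lines compress very well. A minimal sketch, assuming the routine entries look roughly like "TRACE: Valid IMAGE0001.JPG" (a made-up line format for illustration, not copied from the real log); whether the forum accepts compressed attachments is not something stated in this thread:

```python
import gzip


def compress_log_text(text: str) -> bytes:
    """Return the gzip-compressed bytes of a log's full text (nothing is dropped)."""
    return gzip.compress(text.encode("utf-8"))


# Hypothetical stand-in for a log dominated by near-identical lines.
sample_log = "".join(f"TRACE: Valid IMAGE{i:04d}.JPG\n" for i in range(1, 5001))

packed = compress_log_text(sample_log)
original_size = len(sample_log.encode("utf-8"))

# Decompressing gives back the exact original text, so no information is lost.
round_trip = gzip.decompress(packed).decode("utf-8")
```

For a real file, the same idea is to read the log's bytes, pass them to `gzip.compress`, and write the result to a `.gz` file to attach.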

My volume has nothing but family pictures and movies; static binary files. I understand the limitations of Snapshot RAID - but those should be limitations, not "you lose all your data if the product thinks you have too many deleted files." What is too many deleted files? 5? 20? At what point will the product no longer restore, and shouldn't there be some type of warning?

The simple fact that the UI says 1 failed drive, but the restore fails because it says 2 have failed... means there is a bug, or the product is broken. Honestly I don't see how anyone who hasn't worked on building the product can help on that. That doesn't seem like a community member type of question. How to do XYZ, or some basic troubleshooting issues absolutely. But "Sorry we can't restore any of your data because 2 drives have failed, even though only 1 is dead, and even though the UI only says 1 is dead" doesn't seem like a community type of question. I could be totally wrong here.

Just my .02, but I would think understanding why my system thinks there are 2 failed drives even though UI says 1 would be something the engineering team would REALLY like to know. So it doesn't happen to other customers. And every potential customer reading this forum deciding whether they want to buy this product, and seeing my situation.

If you were in my shoes what would you recommend at this point?

Thanks


« Last Edit: October 27, 2017, 07:46:04 pm by mephisto »

Offline mephisto

Uninstalled everything, re-installed everything and re-created the pool. It saw the existing PPU, and the volume starts up just like normal. Shows all the drives happy as can be with the new replacement drive (H:).

I'm thinking great, this is going to work - let's just restore that failed drive! Tell it to restore and that H: has already been happily replaced. And.... same error:

[2017-10-28 11:02:07,370] ERROR: [restore] error: IOException[Too many failed devices! Failed=2 - Number of devices supported to be restored at a time: 1]
java.io.IOException: Too many failed devices! Failed=2 - Number of devices supported to be restored at a time: 1
   at com.tchegbe.lib.io.a.e.t.a(Unknown Source)
   at com.tchegbe.lib.io.a.e.c.a.run(Unknown Source)
   at java.lang.Thread.run(Unknown Source)

Screenshot and log attached.

The pool starts. I can see files from all the drives, and see the files individually in the hidden folder. There are not 2 failed drives. The only files missing are from the original failed H:, and the entire point of purchasing this product was so it could be restored should it fail. Failed it did, restore it will not.
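For context on why the message says only 1 device can be restored at a time: with a single parity unit, one missing data unit can be rebuilt from the survivors plus parity, but two cannot. The sketch below is a generic XOR single-parity illustration using 4 data units and 1 parity unit like the configuration above. It is an assumption about the general technique, not FlexRAID's actual code, and it does not explain why the engine counts a second device as failed here. The names `xor_blocks` and `restore_missing` are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def restore_missing(data_blocks, parity):
    """Rebuild the one missing block (marked None) from the survivors and parity."""
    missing = [i for i, block in enumerate(data_blocks) if block is None]
    if len(missing) != 1:
        # Single parity carries enough information for exactly one lost block.
        raise ValueError(f"can rebuild exactly 1 block, but {len(missing)} are missing")
    survivors = [block for block in data_blocks if block is not None]
    return xor_blocks(survivors + [parity])


# Four data units and one parity unit, with toy contents.
drus = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
ppu = xor_blocks(drus)

# The second unit is lost; the other three plus parity bring it back.
restored = restore_missing([drus[0], None, drus[2], drus[3]], ppu)
```

If a second block were also marked `None`, `restore_missing` would raise instead of guessing, which is the same kind of refusal as the error above.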
« Last Edit: October 28, 2017, 12:38:24 pm by mephisto »

Offline mephisto


Brahim,

According to the link you gave, deletes SHOULD NOT impact my restore. It says:

"Deletes will compromise recovery UNLESS the operations are done through the storage pool and FlexRAID’s proprietary recycle bin feature is turned on"

Everything was done via the storage pool, and the recycle bin feature has always been turned on. So why would this be preventing me from restoring?

What can I do here? It's been 8 days, is there any way to tell this thing to just "restore whatever you can and ignore the deletes" or whatever is making it think there are 2 failed drives?

Would really appreciate some help...

Thanks

Offline mephisto

Also per this post:

http://forum.flexraid.com/index.php?topic=49175.0

"
'See your PM for an alternate build.'

That did it! Thank you! Any extra insight into what the issue was, and what was different in the alternate build?
"

Looks like someone else was receiving the same error, and an alternative build was provided to them and they were able to restore. Is this something I can get?

Offline MediaMan

  • Newbie
  • *
  • Posts: 1
  • Karma: +0/-0
    • View Profile
My uncle and I were LITERALLY about to purchase FlexRAID and set up home media servers... we're just waiting for our Corsair PSUs to show up. I've been lurking on the forum, reading up as much as I can while we wait, and this issue has really got me re-thinking things... I think we are going to use an alternative solution. We need to be able to restore if a disk fails, and not go into some limbo state for weeks... after all, that is the whole point.

I see this same issue posted in a few places, and it looks like there is no resolution. Plus the issue with corrupt parity if a disk dies during update. And then what appears to be really no support here in the forum... (and not even a bug fix release in 2 years??). I really thought this was going to be a great fit for us, but really thinking we are dodging future bullets by looking elsewhere now.

Sorry about your issue, but appreciate you posting your experience!


Offline mephisto

Re: Too many failed devices! Failed=2 error on single failed DRU restore - HELP
« Reply #14 on: November 02, 2017, 09:57:45 am »
Anyone? Help? :)