Topic: Failed Disk, attempting to replace

Offline FrostyWolf

Failed Disk, attempting to replace
« on: October 30, 2014, 12:46:26 pm »
So, I'm doing my best to follow this guide here: http://wiki.flexraid.com/2013/06/27/restoring-a-failed-disk-in-transparent-raid/

I had a failed 3TB disk.

I turned off the array.

I removed it physically, put in a new 4TB disk.

I then registered the new disk. 

Then I clicked on the dropped disk (no longer in the server), clicked "Restore", and picked the new disk.

The Restore RAID screen came up.

The Total/Remainder numbers represent the size of the entire RAID, not just the failed disk. Is that normal?

The raid is 33TB and it is processing it at about 400 MB/s (currently).  Just wondering if that is all normal.
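
For a rough sanity check on those numbers, here is a back-of-the-envelope sketch in Python. It assumes the restore really has to process the full 33TB shown in the Total counter, that the 400 MB/s rate stays constant, and that the units are decimal (1 TB = 1,000,000 MB). The function name is made up for illustration and is not part of tRAID.

```python
def estimate_restore_hours(total_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to process total_tb terabytes at rate_mb_per_s megabytes/second.

    Assumes decimal units (1 TB = 1,000,000 MB) and a constant rate.
    """
    total_mb = total_tb * 1_000_000
    seconds = total_mb / rate_mb_per_s
    return seconds / 3600


# 33 TB at a steady 400 MB/s (the figures from the post above)
hours = estimate_restore_hours(33, 400)
print(f"About {hours:.1f} hours")
```

Under those assumptions a full pass over the array takes roughly 23 hours.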

It's my first failure with tRAID. I have added disks since I first set up the RAID, and that required using the OS disk manager to initialize the new disk and create a volume on it. I didn't do any of that here; is that going to be an issue?

Also, is using a bigger disk going to be a problem? I don't really see anything in the forums or wiki about it. My guess is that, because tRAID seems to deal with the files themselves and not partitions, it would simply put whatever files were on the failed drive onto the new drive and leave the rest empty. But since I was never asked to create a volume or partition, I'm not sure how tRAID determines how much space to use when it creates one itself. Is it going to go by the partition size of the failed disk, or is it just going to use the entire new disk?
« Last Edit: October 31, 2014, 07:03:49 am by FrostyWolf »

Offline FrostyWolf

Re: Failed Disk, attempting to replace
« Reply #1 on: October 31, 2014, 02:42:39 pm »
Well, it didn't work.  I'm not sure what went wrong but any help would be appreciated. 

The new drive now has a 3TB RAW partition on it and 1TB unallocated, and none of the files that were on that disk are showing up in the storage pool.

Best I can tell, what went wrong is that I should not have taken out the bad disk, replaced it with a new disk, and then clicked "Restore" and selected the new disk.

I should have either:

1) Left the bad drive in, installed the new drive, clicked "Restore", and selected the new drive

2) Taken the bad drive out, installed the new one, gone to configuration override, picked swap, swapped the bad disk for the new one, then clicked "Restore" and left it to restore to the same disk.

I'm not 100% sure though. I feel like either I'm an idiot or the restore instructions aren't really clear. The guide says:
Quote
Once a disk has failed, restoring it is very straight forward. Click on the “Restore” button and choose the replacement disk

To choose the replacement disk, it has to be in the machine and registered. I figured that putting it in the machine implied removing the bad disk, not leaving it in there and taking it out later.

Maybe that wasn't the problem though, I have no idea.

In any case, I lost the data that was on that disk. I'm running over 48 hours of downtime at this point, so instead of waiting for a reply that may or may not help me recover the data, I decided to bring the array down, resize the partition to the full 4TB, format it as NTFS, and bring the array back up. Then I'll put the bad disk back in and copy the data off it back into the RAID, since the disk hasn't failed completely; it just has some bad sectors. I'll re-download the data that was in the bad sectors and be done with it.

For future reference, I would like to know the preferred method, step by step, for recovering after a failure: identifying the failed disk, removing it, adding a new one, registering it, and so on. The wiki linked in my first post seems a little unclear, at least to me.



Protip: Don't use the "Restore" button.
« Last Edit: October 31, 2014, 03:59:24 pm by FrostyWolf »

Offline Brahim

Re: Failed Disk, attempting to replace
« Reply #2 on: October 31, 2014, 03:35:28 pm »
You need to restore to an identical disk. Only in RAID-F are you able to restore to a disk of different size.
With tRAID the disk layout as well as the data it contains is restored faithfully. What you are seeing is very much as expected.
The 3TB partition is correct, but the restored disk layout does not match that of the 4TB disk.

In tRAID, when all you have is a disk of a different size, your real option is to simply copy the data from the failed disk to the new disk.
Transparent RAID rebuilds the failed disk live. So all you had to do was give the tRAID disks drive letters and copy the file content from the failed disk to the disk that will replace it. After that, you would simply discard the RAID configuration and create a new one using the resulting set of disks.

Offline FrostyWolf

Re: Failed Disk, attempting to replace
« Reply #3 on: October 31, 2014, 03:39:56 pm »
I see, thank you for that clarification.  So to be extra clear here, the reason I ended up with a RAW partition was because the disk wasn't identical?

My best option would have been to give the tRaid disk a drive letter, then simply copy the file content over, then recreate the raid from scratch.  Thank you for the input.

Offline djvj

Re: Failed Disk, attempting to replace
« Reply #4 on: November 01, 2014, 08:03:42 am »
How do you copy data from a failed disk to a good one when the disk is failed? It could be completely dead!

Offline Brahim

Re: Failed Disk, attempting to replace
« Reply #5 on: November 01, 2014, 10:46:56 am »
Transparent RAID reconstructs the failed disk live for you. That's one of its key features. You can then assign a drive letter to that disk and copy the files specifically on that disk to another disk.
So, you are typically unaffected by a disk failure from a data access standpoint.
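
To make "reconstructs the failed disk live" a bit more concrete, here is a minimal Python sketch of single-parity reconstruction using XOR. It only illustrates the general idea behind parity-based RAID; this thread does not say how tRAID actually computes its parity, so the XOR scheme, the block contents, and the `xor_blocks` helper are all assumptions.

```python
from functools import reduce
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks (DRUs), one small block each.
dru = [b"movies..", b"music...", b"photos.."]

# The parity disk (PPU) holds the XOR of all data blocks.
ppu = xor_blocks(dru)

# Disk 1 dies. XOR-ing the surviving data blocks with the parity block
# yields the missing block again, without reading the dead disk.
survivors = [dru[0], dru[2]]
rebuilt = xor_blocks(survivors + [ppu])
print(rebuilt == dru[1])
```

The rebuilt block equals the block that was on the dead disk, which is why the virtual disk can still be read and copied even though the physical drive is gone.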

Offline docchris

Re: Failed Disk, attempting to replace
« Reply #6 on: November 27, 2014, 03:33:43 am »
Oh, I see.

Several times I have read you advising people to copy the files from the failed disk, and I always thought that made no sense.

But in fact you are advising them to copy the files from the failed "transparent" disk, not the physically failed drive!

I think you may need to be a little clearer, as every time I have read that, I have envisaged removing the failed physical drive and trying to copy the data off it.

Offline lahatte

Re: Failed Disk, attempting to replace
« Reply #7 on: March 13, 2017, 05:44:45 pm »
How exactly do you copy the files over to a new drive? Do you do it in Windows, or within FlexRAID, or somewhere else?

Thanks

Offline Brahim

Re: Failed Disk, attempting to replace
« Reply #8 on: March 14, 2017, 11:02:13 am »
You do it in Windows. You can assign drive letters to the tRAID disks (not source disks, but the virtual disks created by tRAID).
So, it is just a matter of identifying the virtual disk representing the failed disk, and copying the contents of that virtual disk (tRAID disk).
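
If you would rather script the copy than drag and drop in Explorer, a minimal Python sketch could look like this. It assumes the virtual tRAID disk for the failed drive has been given the letter T: and the new, already formatted disk is N: (both letters are hypothetical), and `copy_disk_contents` is just an illustrative helper, not a FlexRAID tool. It skips two Windows system folders at the volume root that usually should not be copied.

```python
import shutil
from pathlib import Path

# Windows system folders found at the root of a volume; skipped on purpose.
SKIP = shutil.ignore_patterns("System Volume Information", "$RECYCLE.BIN")


def copy_disk_contents(source_root: str, target_root: str) -> None:
    """Copy everything under source_root into target_root.

    Existing folders on the target are reused (dirs_exist_ok=True, Python 3.8+),
    and copy2 keeps file timestamps where the filesystem allows it.
    """
    source = Path(source_root)
    if not source.is_dir():
        raise FileNotFoundError(f"source not found: {source}")
    shutil.copytree(
        source,
        Path(target_root),
        copy_function=shutil.copy2,
        ignore=SKIP,
        dirs_exist_ok=True,
    )


# Example call with hypothetical drive letters (adjust to your own system):
# copy_disk_contents("T:\\", "N:\\")
```

Any ordinary copy method (Explorer, robocopy, and so on) does the same job. The point is only that the source is the virtual disk's drive letter, not the dead physical drive.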

Offline lahatte

Re: Failed Disk, attempting to replace
« Reply #9 on: March 20, 2017, 01:41:06 pm »
The old drive is dead though. So how will Windows copy the data over to a new drive?

Offline adridolf

Re: Failed Disk, attempting to replace
« Reply #10 on: March 20, 2017, 05:30:36 pm »
Reread the thread and specifically the post by docchris. Technically, it's all been explained ...

Offline lahatte

Re: Failed Disk, attempting to replace
« Reply #11 on: March 20, 2017, 07:21:14 pm »
Yes, I read that several times... I guess I'm lost on how to detect the correct virtual drive. I'm not technical enough to "get it", I guess...

Offline adridolf

Re: Failed Disk, attempting to replace
« Reply #12 on: March 21, 2017, 05:22:16 am »
Let's say you have 3 DRU drives of identical size. Then your Windows Disk Management console will show you the three physical drives (marked Offline), three "transparent" drives, and one "pool" drive. The labels of the three transparent drives should correspond to the volume labels of the partitions you initially had on the physical drives. If you now disable/stop the pool (not the array) in the Web UI, you can access all three individual "transparent" drives by assigning drive letters to them.

If one of your physical disks has failed, the disk management console may show only two of the physical drives, but the three transparent ones should still be there (the one corresponding to the failed disk being virtualized based on parity data). If you are not sure which is which, remember that the transparent disks are those looking like normal disks, while the physical ones are marked "offline".

The only problem might be finding out which transparent disk corresponds to the failed physical disk, if you did not name things clearly in the first place. (The Web UI shows names for the DRU disks, and I consider it good practice to name the volumes on the transparent disks accordingly.)

Offline lahatte

Re: Failed Disk, attempting to replace
« Reply #13 on: March 21, 2017, 09:19:23 am »
When I right-click on "This PC" (Win 8.1), then click "Manage", then click "Disk Management", an "Initialize Disk" window pops up asking me to initialize a disk as MBR or GPT... I assume the 2 drives listed are the bad ones?

What are the steps to copy the virtual bad drive to the new good drive? Is that done in Windows or in the FlexRAID interface? How do I select the bad drive and actually command it to copy over to the new drive... "is there a button for that"? lol

Offline adridolf

Re: Failed Disk, attempting to replace
« Reply #14 on: March 22, 2017, 08:46:44 am »
Quote
When I right-click on "This PC" (Win 8.1), then click "Manage", then click "Disk Management", an "Initialize Disk" window pops up asking me to initialize a disk as MBR or GPT... I assume the 2 drives listed are the bad ones?

This means that the listed disks have neither partitions nor a partition table, which is what you typically see with completely new disks. Since I fortunately have not encountered something like this so far, I would GUESS that those are the defective disks, where the disks themselves are recognized by the driver but nothing on them is (maybe someone else can help out here).

However, if you have two failed drives, you would also need two PPUs to reconstruct them; otherwise you just have what's left on the working disks.

To identify the physical disks, you can look in the tRAID Web UI, where the disk numbers (indices) are listed next to the disks in the "Physical Drives" list. Those with the correct drive name are the physical disks, while those labeled "NZFS SAS Device" are the transparent ones. You can also go to the "Registered Disks and Volumes" tree and right-click -> "Open/View details" to identify them based on the disk names you gave in tRAID.
Once you know the disk index number ("Disk X"), it will be the same as the one shown in the "Initialize Disk" dialog, so now you know which (physical) disks are actually defective. Note that you can just cancel the "Initialize Disk" dialog and nothing will happen, but you will get to the main Disk Management window. There you can then try to identify the transparent disks (again by the disk index) and assign drive letters to them.
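
The identification rule above can be written down as a small Python sketch. The disk listing here is typed in by hand with made-up model names and indices; nothing is queried from Windows or tRAID, and `DiskEntry` and `transparent_disk_indices` are hypothetical names. The only thing taken from this post is that the transparent disks show up as "NZFS SAS Device".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiskEntry:
    index: int  # the "Disk X" number shown in Disk Management and the Web UI
    model: str  # device name as listed under "Physical Drives"


def transparent_disk_indices(entries: List[DiskEntry]) -> List[int]:
    """Return the indices of entries whose name marks them as tRAID virtual disks."""
    return [e.index for e in entries if "NZFS SAS Device" in e.model]


# Made-up listing: three physical disks followed by their three virtual disks.
listing = [
    DiskEntry(0, "Example 3TB HDD A"),
    DiskEntry(1, "Example 3TB HDD B"),
    DiskEntry(2, "Example 3TB HDD C"),
    DiskEntry(3, "NZFS SAS Device"),
    DiskEntry(4, "NZFS SAS Device"),
    DiskEntry(5, "NZFS SAS Device"),
]

print(transparent_disk_indices(listing))
```

The indices it returns are the "Disk X" entries to give drive letters to; everything else in the listing is a physical disk and should be left alone.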

Have a look at the attached picture.