Recent Posts

Pages: [1] 2 3 ... 10
1
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by Brahim on December 14, 2017, 07:55:14 pm »
...I am sure many users are like myself, we read fully the documentation, set it up in the best way possible, and in 5 years when a disk fails, we expect it to work. ..
You are still missing the core of it. The software will not fix your hardware issues. Testing whether the product works as intended is rather trivial given its nature. So, set up a test array (maybe in a VM) and practice the recovery aspect to gain confidence. You can even test using just folders. Defects are relatively easy to prove out and should be logged on http://bug.flexraid.com.
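
One way to practice recovery on a folder-based test array is to record a checksum for every file before simulating a failure, then compare again after the Restore. The sketch below is only an illustration and is not part of FlexRAID; `build_manifest` and `compare_manifests` are hypothetical helper names, and SHA-256 is simply a convenient choice of hash.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def compare_manifests(before, after):
    """Return (missing, changed, extra) lists of relative paths, each sorted."""
    missing = sorted(set(before) - set(after))
    extra = sorted(set(after) - set(before))
    changed = sorted(p for p in before if p in after and before[p] != after[p])
    return missing, changed, extra
```

Build a manifest of the test folder before removing a "failed" folder, run the Restore, then build a second manifest and compare. Three empty lists would suggest the restored data matches the original.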

The point I am trying to get across is that users need good troubleshooting skills when going with custom-built systems. If this were a fully integrated system where we also sold the hardware, then it would be a different story. Those not keen on troubleshooting their own hardware should not build their own system. Fully integrated systems have their place.
There is no way to write a guide for the various systems that get put together, nor is it for us to troubleshoot them. So, the tips provided here are super important to follow.

Everything about your troubles circles back to not following the crucial steps I previously outlined. None of it has to do with the frequency of Update, Verify, or whatnot. It is strictly a matter of troubleshooting the issues you had at hand. As you have not resolved them, you are likely to face them again.

So:
1. make sure to rebuild the parity anew
2. anytime the process seems to hang, check the Windows event viewer for clues, open Resource Monitor to check CPU and disk usage, and go one step further with Process Monitor to see which files the RAID-F service process is currently accessing
3. resolve any issue before retrying (blindly retrying with fingers crossed is a bad strategy)
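
As a rough illustration of step 2, the check "is the process still doing anything?" amounts to comparing two readings of its cumulative CPU time and disk I/O counters taken some minutes apart. The sketch below assumes you collect those readings yourself (for example by noting them from Resource Monitor); the `Sample` type and `classify` function are hypothetical names, not part of FlexRAID or Windows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One reading of a process's cumulative counters (assumed inputs)."""
    cpu_seconds: float  # total CPU time consumed so far
    read_bytes: int     # total bytes read so far
    write_bytes: int    # total bytes written so far


def classify(earlier, later):
    """Describe what changed between two samples of the same process."""
    cpu_busy = later.cpu_seconds > earlier.cpu_seconds
    disk_busy = (later.read_bytes > earlier.read_bytes
                 or later.write_bytes > earlier.write_bytes)
    if cpu_busy and disk_busy:
        return "cpu+disk"
    if disk_busy:
        return "disk"
    if cpu_busy:
        # CPU use without any disk I/O: still active, but worth a look in
        # Process Monitor and the event viewer before drawing conclusions.
        return "cpu"
    return "idle"
```

Anything other than "idle" means the process is still active. A long run of "cpu" with no disk I/O is the situation where Process Monitor and the event viewer are most likely to show what it is stuck on.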

How often you run Validate and Verify is truly up to you. Obviously, the more frequently the better. :)
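
Whatever cadence you settle on, it helps to track when each task last ran. The sketch below uses the intervals proposed in the question (Update weekly, Validate monthly, Verify every 2 months) purely as example values, approximating a month as 30 days; `INTERVALS` and `due_tasks` are hypothetical names and nothing here talks to FlexRAID itself.

```python
from datetime import date, timedelta

# Example intervals only; adjust to whatever cadence you choose.
INTERVALS = {
    "Update": timedelta(days=7),
    "Validate": timedelta(days=30),
    "Verify": timedelta(days=60),
}


def due_tasks(last_run, today):
    """Return the tasks whose interval has elapsed (or that never ran)."""
    due = []
    for task, interval in INTERVALS.items():
        last = last_run.get(task)
        if last is None or today - last >= interval:
            due.append(task)
    return due
```

For example, `due_tasks({"Update": date(2017, 12, 10)}, date(2017, 12, 14))` reports Validate and Verify as due because neither has a recorded run.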
2
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by johnzm on December 14, 2017, 05:26:28 pm »
It really does not matter how long it takes. Again, it will take as long as it needs to. If disk activity drops because of disk issues, then that's just what it is. The proper process is to look at the Windows event viewer for potential clues, and look at Resource Monitor and/or Process Monitor for additional detail. If these show that it is hung on a file, then you have enough info to take the most appropriate action. For instance, you could abort the Restore and then use a file repair tool to correct the problematic file. What you don't do is blindly stop the task and retry it hoping that in itself would fix things.

RAID-F needs a healthy file system to operate over. So, running chkdsk and the like to ensure your file systems are healthy can sometimes be needed.

Rebuild your parity anew. Don't shortcut it. If parity cannot be built without issues, then resolve all issues at each step. Building parity anew will be a good test of the entire system (hardware, drivers, file system, etc.).

Maybe the logging needs some additional info? As a user, I see disk activity stop at 2 hours, and for 8 more hours the log sits on a txt file with no disk activity (these checks generate major disk activity; it's no secret when you have 20+ drives flashing in your face). It takes somewhere around 12 hours to build a new parity, I believe. I am sure many users are like myself, we read fully the documentation, set it up in the best way possible, and in 5 years when a disk fails, we expect it to work. I'll admit I do not know all the ins and outs, but what I expected was to follow the documentation: replace the drive with one of the same size or larger, assign the same drive letter, and restore via parity.

Yes, sometimes things don't work, and I appreciate the support since you have responded, which is more than some even bigger places can handle! But others and I use this parity and are placing a large amount of trust in it.

Can you please outline some steps I can take in the future, after I rebuild this parity, to maintain and verify its integrity?

For instance:
Update once a week (I do this currently)
Validate once a month
Verify every 2 months

Thank you for your time
3
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by Brahim on December 14, 2017, 11:22:36 am »
It really does not matter how long it takes. Again, it will take as long as it needs to. If disk activity drops because of disk issues, then that's just what it is. The proper process is to look at the Windows event viewer for potential clues, and look at Resource Monitor and/or Process Monitor for additional detail. If these show that it is hung on a file, then you have enough info to take the most appropriate action. For instance, you could abort the Restore and then use a file repair tool to correct the problematic file. What you don't do is blindly stop the task and retry it hoping that in itself would fix things.

RAID-F needs a healthy file system to operate over. So, running chkdsk and the like to ensure your file systems are healthy can sometimes be needed.

Rebuild your parity anew. Don't shortcut it. If parity cannot be built without issues, then resolve all issues at each step. Building parity anew will be a good test of the entire system (hardware, drivers, file system, etc.).
4
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by johnzm on December 14, 2017, 04:44:33 am »
I looked back and realized I left out some important information. The screenshot didn't indicate what was going on because, while I showed you time stamps, I left out the local time.

The log was stopped on reading a txt file, and it was on that file for 8 hours. Disk activity had dropped to 0 and FlexRAID (per Task Manager) was using around 20% CPU, on what I don't know.

I could separate the array out into smaller pieces. It's currently a single 20 TB array. I'm already using Expert mode because I wanted to include drives that are not part of the main array in the parity for protection.

I read the notes on Snapshot, and I was not doing anything out of the ordinary: rare edits, mostly deletes, and maybe once every 3 years some moving things around.

I have moved my backed-up data to the array. What's the best way to ensure that my parity is fully up to date without starting a new one? (I want to check and make sure there are no corrupted files on any of my drives.)
5
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by Brahim on December 13, 2017, 08:04:48 pm »
Then you spent those days doing everything but the correct thing to do. ;)
You can't rush it. It will take as long as it needs to take. You have a very large array with a lot of disks and likely a lot of files. All you had to do was let it run its course.

If Resource Monitor and Process Monitor show it using CPU and accessing disk, then it is active. It will only go as fast as your system lets it.
Another place to look is in the Windows event viewer for additional clues such as disk/file system errors.

Clearly, you should rethink having all those disks under a single array. You should configure multiple smaller Snapshot arrays in Expert mode and then pool them all under a single pool.
6
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by johnzm on December 13, 2017, 04:24:33 pm »
OK, so I'm giving up on the parity restore. I've got 5 days of downtime, and 3 of those days were spent troubleshooting.

I need to know what the cause is. I have the data from the lost (intermittently disconnecting) drive, and will be moving it manually tonight to the new drive, which has the same drive letter.

If I run a Verify, will it tell me what's broken on the array so that I can resolve the issue?
7
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by Brahim on December 13, 2017, 04:07:53 pm »
Folders are never an issue. Having copies of a folder across drives is normal. That's how the pooling works. Data is spread across drives for data balancing.

In your pooling configuration, you can switch to folder priority if you wish (http://wiki.flexraid.com/about/flexraid-storage-pooling/). Again, please spend some time on the wiki to understand these rather key points of the product you are using. At least, pay attention to your configuration options and what they mean.
8
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by johnzm on December 13, 2017, 12:50:22 pm »
I just ran a scandisk on the drive holding the listed file that it's been stopped on, and that drive had no issues.

While searching for that file, I did find an odd occurrence:
I have multiple copies of the same folder all over my server. Some of the folders have no data in them, some have one or all of the files in them. Essentially this is duplicated data over the array. I don't ever do any work on these drives outside of the FlexRAID drive that's created, so I find that very odd.

Do you have any other suggestions?
9
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by Brahim on December 13, 2017, 12:28:40 pm »
Also, give Process Monitor a try for more detailed file access info: https://docs.microsoft.com/en-us/sysinternals/downloads/procmon
10
Snapshot RAID / Re: hanging on the restore pre-validation process
« Last post by Brahim on December 13, 2017, 12:18:49 pm »
I did not say Task Manager. I said Resource Monitor.
Look at both CPU and disk usage. If either is active, then it is active. Slow or hung disk activity could be tied to bad sectors/files. To fix those, run chkdsk on your drives and then retry the Restore.

Also, make sure you don't have an anti-virus or the like intruding. You should disable those or configure them to ignore the RAID-F service process's activities.
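
If you have many drives to check, the chkdsk runs can be scripted. The sketch below is a minimal example, assuming a Windows machine and an elevated prompt; `chkdsk_command` and `run_chkdsk` are hypothetical helper names. The only chkdsk switches used are `/f` (fix file system errors) and `/r` (locate bad sectors and recover readable data, which implies `/f`); with neither switch, chkdsk only reports.

```python
import subprocess
import sys


def chkdsk_command(drive_letter, fix=False, scan_bad_sectors=False):
    """Build the chkdsk argument list for one drive letter, e.g. 'D' or 'd:'."""
    letter = drive_letter.rstrip(":\\").upper()
    if len(letter) != 1 or not ("A" <= letter <= "Z"):
        raise ValueError(f"not a drive letter: {drive_letter!r}")
    command = ["chkdsk", f"{letter}:"]
    if scan_bad_sectors:
        command.append("/r")  # /r implies /f
    elif fix:
        command.append("/f")
    return command


def run_chkdsk(drive_letter, **options):
    """Run chkdsk on one drive and return its exit code (Windows only)."""
    if sys.platform != "win32":
        raise RuntimeError("chkdsk is only available on Windows")
    return subprocess.run(chkdsk_command(drive_letter, **options), check=False).returncode
```

Starting with a report-only pass (no switches) on each drive and moving to `/f` or `/r` only where errors show up keeps the long scans to the drives that need them.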