Strange Data Corruption Issue

October 20, 2018, 12:28:01 am
I have a Windows 10 based system with five 2 TB hard drives: 4 DRUs and 1 PPU. I have stored all my Blu-rays on the server in MKV format. Now the MKVs have suddenly started to come up corrupt. An MKV starts to play and then suddenly stops. I have tried copying a movie onto a flash drive and playing it from there, but I get the same issue.
I checked, and all the corrupted files were from a hard drive which recently failed and whose data I recovered via the parity drive. I used to run an update on a weekly basis, yet most of the recovered data is corrupted. What is the use of FlexRAID if it cannot recover data, or did I do something wrong? Can somebody tell me how to make sure this does not happen in case of another hard drive failure? Thanks.
Re: Strange Data Corruption Issue
Reply #1 on: October 21, 2018, 03:38:51 pm
The issue is that the data on the failing drive got corrupted because the drive was dying. That bad data was then replicated to the parity drive, because tRAID computes parity live rather than on a schedule like a snapshot RAID such as RAID-F.  This isn't a failure of tRAID, as the same would happen under most other RAID technologies.  Restoring from the parity drive got your data back onto a healthy drive, just as designed.  I'd guess that you likely did not lose everything which was on the failing drive, but actually managed to recover quite a bit.

What you seem to have done "wrong" is consider RAID as a full backup solution.  It's not.  While you do get some redundancy and peace of mind using RAID, any data you want to ensure is not lost must be duplicated.  As you saw, if a drive begins to fail and starts writing bad data, that bad data gets written to the parity drive and cannot be recovered without restoring from a real backup.  Again, the parity drive is not an actual backup.

Most of us feel that movies are not critical to lose, so we don't actually back them up in software.  RAID gives us just enough redundancy, so if a drive fails and we lose a few movies, it's no big deal.  Our actual backup is the physical movie disc.  If this risk isn't acceptable to you, then you would need to double your storage and create real software backups of your movies.
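
For anyone who does want a real software backup, here is a minimal Python sketch of a copy that checks itself. It is only an illustration, not a FlexRAID feature: the function names (sha256_of, backup_file, is_intact) are made up for this example, and it assumes the source file is still good at the time of the backup. The idea is to record a SHA-256 hash when the copy is made, so a later read of either copy can be checked against it.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy source into backup_dir, confirm the copy matches, return the hash.

    Hypothetical helper: the returned hash should be stored somewhere safe
    so the file can be re-checked later.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies contents and file metadata
    source_hash = sha256_of(source)
    if sha256_of(target) != source_hash:
        raise OSError(f"backup copy of {source} does not match the original")
    return source_hash


def is_intact(path, recorded_hash):
    """True if the file still hashes to the value recorded at backup time."""
    return sha256_of(path) == recorded_hash
```

Running is_intact against the recorded hash from time to time would flag a file that has changed on disk, whichever drive it lives on. It cannot help if the file was already corrupt when the hash was recorded.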

When you say "I used to run update on weekly basis", do you mean running a VERIFY, VERIFY+, or VERIFY SYNC on your tRAID array?  If so, then you're not understanding what those jobs do.  You can read up on that here.  In short, they verify that the parity on the PPU matches the data on the DRUs.  In your case, the corrupted data was sent to the parity drive, so as far as those jobs were concerned the data matched.
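
To make that concrete, here is a small Python sketch of the situation described above. It is a simplified model, not tRAID's actual parity engine: it assumes plain single-parity XOR across four equal-sized blocks, and the block contents and function names are invented for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def verify(data_blocks, parity):
    """A verify-style check: does the parity match the current data?"""
    return xor_blocks(data_blocks) == parity


def rebuild(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors plus parity."""
    return xor_blocks(surviving_blocks + [parity])


# Four data drives (DRUs) holding good data, one parity drive (PPU).
original = [b"MOVIE-A1", b"MOVIE-B2", b"MOVIE-C3", b"MOVIE-D4"]
parity = xor_blocks(original)
assert verify(original, parity)

# Drive 2 starts dying and bad data lands on it. Because parity is
# kept up to date live, it now reflects the bad data.
drives = list(original)
drives[2] = b"MOV\x00E-C\x00"
parity = xor_blocks(drives)

# The verify check still passes: parity and data agree with each other.
verify_passes = verify(drives, parity)

# Drive 2 then fails completely and is rebuilt from parity.
restored = rebuild(drives[:2] + drives[3:], parity)

# The rebuild faithfully returns the corrupted block, not the original.
restored_is_corrupt = restored == drives[2] and restored != original[2]
```

In this model both verify_passes and restored_is_corrupt come out True: the check only confirms that parity and data agree, and the rebuild returns whatever was last written, good or bad.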

One thing which could mitigate your losses is installing HD Sentinel.  That software will monitor your drive's health 24/7 and alert you if one might be failing.  That has saved me many times (especially back in the horrible Seagate days) and, most often, I was alerted a drive was failing long before any data got corrupted.  So, I was able to take the drive out of the array and replace it with a healthy drive.  Also, because I was alerted so early, I was usually able to copy all the data directly off the failing drive and put it on the healthy drive, without restoring from parity at all.  I will never run any computer (server or even a laptop) without this software installed.

Hope that helps!