Author Topic: bitrot prior to parity update?  (Read 11416 times)

Offline Shadowsoul

  • Newbie
  • Posts: 49
  • Karma: +0/-0
bitrot prior to parity update?
« on: December 24, 2013, 05:20:39 am »
Hi, I couldn't find an answer to this other than a claim that it is a problem, so I'm asking here.

Does FlexRAID do a parity validation on a file before using it to calculate parity during an update?

The scenario I'm thinking of: if an existing file gets a silent error and is then used for a parity computation, that error would be part of any restore after that, right?
Can FlexRAID prevent this automatically somehow, or are scheduled validates the only option? (That raises another issue: they require the update to run beforehand, which kills any chance of them detecting the error.)

Offline ezechiel1917

  • Newbie
  • Posts: 12
  • Karma: +1/-0
Re: bitrot prior to parity update?
« Reply #1 on: December 28, 2013, 08:32:00 pm »
I've just tested this, and a silent data error is not detected when you run Update. If the silently corrupted data belongs to existing files that are used for parity recalculation alongside newly detected/updated files, the corruption propagates silently into the parity.

If the rotted data is not part of any file used for parity recalculation during the update, it won't automatically propagate to parity (since modification dates didn't change).
But it can happen on any later update run, whenever those files belong to a parity block that gets updated in the process. In the meantime you have a chance to detect them with verify/validate and recover them from a healthy PPU. But once they have propagated to parity, you won't be able to recover them in their original state.

I was kind of surprised this happens; I was expecting some kind of checksum validation on existing files used for parity recalculation too.
I guess your only option is to validate before running update, to minimize data rot propagation to parity. But that's not an ideal solution imo.
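
[Editor's sketch] The propagation described in this post can be modelled in a few lines. This is only a sketch, assuming single XOR parity over equal-sized blocks and CRC32 checksums; the thread does not show FlexRAID's actual parity engine, and every name below is hypothetical.

```python
from functools import reduce
import zlib

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (single-parity model, an assumption)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data units (DRUs) holding one block each, plus one parity block (PPU).
dru = [b"AAAA", b"BBBB", b"CCCC"]
stored_crc = [zlib.crc32(block) for block in dru]  # checksums from the last update
parity = xor_blocks(dru)

# Silent corruption on the second DRU: content changes, nothing marks it as modified.
original = dru[1]
dru[1] = b"BBxB"

# Before the next update, the old parity can still rebuild the original block.
recovered_before = xor_blocks([dru[0], dru[2], parity])

# A legitimate change on the third DRU triggers an update of this parity block,
# which is recomputed from the current content of all three DRUs.
dru[2] = b"DDDD"
parity = xor_blocks(dru)

# After the update, a rebuild returns the corrupted content instead.
recovered_after = xor_blocks([dru[0], dru[2], parity])

print(recovered_before == original)                  # True
print(recovered_after == original)                   # False
print(zlib.crc32(recovered_after) == stored_crc[1])  # False
```

The last line mirrors what is reported later in the thread: the stored checksum still flags the restored file as corrupt, but by then the parity can no longer produce the original.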

Offline Brahim

  • Global Moderator
  • Hero Member
  • Posts: 8,385
  • Karma: +201/-15
Re: bitrot prior to parity update?
« Reply #2 on: December 29, 2013, 07:28:07 am »
If you have such a concern, run the Validate task prior to Updates.
Forcing it as part of Updates would lengthen such operations.

Offline ezechiel1917

  • Newbie
  • Posts: 12
  • Karma: +1/-0
Re: bitrot prior to parity update?
« Reply #3 on: December 29, 2013, 10:00:28 am »
You're right, but validating only the files that are part of the update process would still be much faster than validating everything prior to the update.
What about some kind of advanced setting (i.e. making it optional)? Wouldn't that be win/win for everyone? ;)

Offline Quaraxkad

  • Sr. Member
  • Posts: 381
  • Karma: +24/-1
Re: bitrot prior to parity update?
« Reply #4 on: December 29, 2013, 10:52:25 am »
Validating files that have changed and are going to be part of the next update doesn't make sense. They will not match the parity because they have been changed and need to be updated, so they will obviously fail validation. That doesn't mean the failure is because of bitrot.

Offline ezechiel1917

  • Newbie
  • Posts: 12
  • Karma: +1/-0
Re: bitrot prior to parity update?
« Reply #5 on: December 29, 2013, 11:54:03 am »
We are talking about validation of files which have NOT been changed (the modified date/time remains the same, but they have silently bit-rotted, so their CRC won't match when you run validate).
For FlexRAID they are the same old files until you run validate/verify, at which point you find out something corrupted them.
Such files can still be part of the parity recalculation process alongside the files that are actually detected as new/modified/deleted.

Offline Quaraxkad

  • Sr. Member
  • Posts: 381
  • Karma: +24/-1
Re: bitrot prior to parity update?
« Reply #6 on: December 30, 2013, 09:54:28 am »
But files that have not changed are not part of an update process, so their bitrot will not propagate into the parity. Those files will be detected during a Verify or Validate. The only time bitrot will propagate into the parity is when a file that has changed also has silent bitrot: FlexRAID will assume the changes are intentional and update the parity to match the now corrupted file. You would never know it, and running a verify or validate on that same file would not help either, because the file was changed intentionally regardless of the bitrot.

Offline ezechiel1917

  • Newbie
  • Posts: 12
  • Karma: +1/-0
Re: bitrot prior to parity update?
« Reply #7 on: December 30, 2013, 10:29:57 am »
You're not right, files that have not changed are also part of an update process. That's how parity calculation works.
1) You have files which are new/changed/deleted.
2) But to recalculate parity you also have to use some existing files from every DRU that is part of an updated parity block.

You can't magically recalculate parity just with information from PPU and new/changed/deleted files.
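
[Editor's sketch] A small worked example of the point above, assuming single XOR parity (the thread does not confirm how FlexRAID computes parity; all names are hypothetical). With XOR, a parity block can be refreshed either by re-reading every unit in the block, or by a delta that needs the old content of the changed block. Old parity plus only the new content is not enough, so if the old content is no longer available the unchanged units have to be read.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together (single-parity model, an assumption)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

d1, d2 = b"\x01\x02", b"\x10\x20"            # unchanged blocks on two DRUs
old_d3, new_d3 = b"\xff\x0f", b"\x00\xf0"    # third DRU, before and after a change
old_parity = xor_blocks(d1, d2, old_d3)

# Full recompute: reads the unchanged blocks d1 and d2 as well as the new one.
full = xor_blocks(d1, d2, new_d3)

# Delta update: avoids d1 and d2, but only works if the OLD content is still known.
delta = xor_blocks(old_parity, old_d3, new_d3)

# Old parity plus the new content alone does not give the right answer.
naive = xor_blocks(old_parity, new_d3)

print(full == delta)  # True
print(full == naive)  # False
```

In the full-recompute path, whatever currently sits in d1 and d2 goes straight into the new parity, which is where a silently corrupted but "unchanged" file would slip in.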

Offline tonym

  • Jr. Member
  • Posts: 88
  • Karma: +3/-0
Re: bitrot prior to parity update?
« Reply #8 on: December 31, 2013, 09:44:59 am »
ezechiel1917's conclusion, that a silent data error (bit rot) in a data file which is later used in a parity recalculation gets written into the parity, is bad news. It also runs contrary to the final conclusion of a similar thread from Aug 2012, "Avoid corruption propagation: Verify before updating ?"; see http://forum.flexraid.com/index.php/topic,1111.msg9081.html#msg9081

In that earlier thread, I concluded (undisputed) that a silent data error within a file that is part of the parity recalculation would cause the Update to fail, i.e. the bit rot would be detected. This conclusion was based upon Brahim's comments in an Oct 2011 post, "Difference between update/verify/validate" (see http://forum.flexraid.com/index.php/topic,23.15.html), in which he states:
"...the update task does its own validation to determine if it is safe to update first before proceeding"
and also
"Data rot is only an issue when restoring, and the restore process will tell you of it (albeit too late)"

Please, Brahim, clarify this for us once and for all. And if this is how RAID-F behaves, please consider validating the files involved in a parity update before the update proceeds to change the parity files. It may still be too late to recover the file corrupted by data rot, but at least its existence is detected.

Offline Shadowsoul

  • Newbie
  • Posts: 49
  • Karma: +0/-0
Re: bitrot prior to parity update?
« Reply #9 on: December 31, 2013, 09:45:56 am »
Hello everyone, sorry for not responding faster, been enjoying vacation :P

Quote from: Brahim
If you have such a concern, run the Validate task prior to Updates.
Forcing it as part of Updates would lengthen such operations.


Is this really possible? If there are any changes in the array won't the validate action always fail the changed files, which means it will be all but impossible to sort them out from silent data corruption?



Quote from: ezechiel1917
You're not right, files that have not changed are also part of an update process [..]

This is exactly the thing I'm thinking about, yes.
It is a small risk but it does exist, which is why most of the "real" RAID schemes I've seen (as well as one or two of the snapshot-to-parity solutions) do a checksum validation when reading the files.


Quote from: ezechiel1917
You're right, but validating only the files that are part of the update process would still be much faster than validating everything prior to the update.
What about some kind of advanced setting (i.e. making it optional)? Wouldn't that be win/win for everyone? ;)

Exactly what I was thinking as well. Considering how relatively fast an update is when it only covers smaller deltas (as I think most of us do, daily or weekly), I for one would gladly trade some update time for knowing that the risk of a silent error being introduced is much lower, rather than just hoping that nothing has been corrupted since I last ran a validate/verify.

Offline Shadowsoul

  • Newbie
  • Posts: 49
  • Karma: +0/-0
Re: bitrot prior to parity update?
« Reply #10 on: December 31, 2013, 09:55:05 am »
Quote from: tonym
ezechiel1917's conclusion, that a silent data error (bit rot) in a data file which is later used in a parity recalculation gets written into the parity, is bad news.

Agreed, bad news. I was not sure how to test it myself, hence I asked.
Maybe ezechiel1917 can tell us how he tested it? (I'm guessing it's possible if one were to edit the raw files on the disks, by removing them from the RAID or so, and make sure that the modified date stays the same?)


Quote from: tonym
[..] "Difference between update/verify/validate" (see http://forum.flexraid.com/index.php/topic,23.15.html), in which he states:
"...the update task does its own validation to determine if it is safe to update first before proceeding"
and also
"Data rot is only an issue when restoring, and the restore process will tell you of it (albeit too late)"
[..]

I'm not really sure what is meant by this. What kind of validation is done by the update task? If it actually verifies the checksum then I'll breathe a sigh of relief, but from one of the threads it seems like it only checks that the files still exist?


Quote from: tonym
Please, Brahim, clarify this for us once and for all. And if this is how RAID-F behaves, please consider validating the files involved in a parity update before the update proceeds to change the parity files. It may still be too late to recover the file corrupted by data rot, but at least its existence is detected.

I can only agree in asking very politely for a clear answer on this. (And to add my vote to please consider adding at least the option to enable a validate on all files involved in an update-task).

Offline ezechiel1917

  • Newbie
  • Posts: 12
  • Karma: +1/-0
Re: bitrot prior to parity update?
« Reply #11 on: December 31, 2013, 12:30:31 pm »
Quote from: Shadowsoul
Agreed, bad news. I was not sure how to test it myself, hence I asked.
Maybe ezechiel1917 can tell us how he tested it? (I'm guessing it's possible if one were to edit the raw files on the disks, by removing them from the RAID or so, and make sure that the modified date stays the same?)

Quote from: Shadowsoul
I'm not really sure what is meant by this. What kind of validation is done by the update task? If it actually verifies the checksum then I'll breathe a sigh of relief, but from one of the threads it seems like it only checks that the files still exist?

Quote from: Shadowsoul
I can only agree in asking very politely for a clear answer on this. (And to add my vote to please consider adding at least the option to enable a validate on all files involved in an update-task).

You can test this scenario by (hex) editing an existing file (leaving all file attributes untouched) that is part of a parity block which will be updated by the next Update. (You can find out which parity blocks will be updated by running an update with a new test file included and checking the log for all files that are actually processed during it.)
If you run such an update (with this edited file and a new file), it finishes successfully without any warning.
Delete the edited file and recover it from parity, and it is recovered in its edited version, with a warning that it is corrupted (FlexRAID knows the healthy CRC checksum of the unedited version).
Now you know something went wrong, but at this point you can't do anything about it, since it has already propagated to parity.
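
[Editor's sketch] For anyone who wants to repeat the test without a hex editor, something like the following could stand in for the manual edit. It flips one byte in place and then restores the access and modification times, so size and mtime stay the same. The function name is hypothetical, and whether FlexRAID looks at any other attribute is not stated in the thread (the inode change time on Linux, for example, cannot be reset this way). Only run it against a throwaway copy.

```python
import os
import tempfile

def flip_byte_keep_times(path, offset=0):
    """Invert one byte of a file in place, then restore its atime and mtime."""
    stat_before = os.stat(path)
    with open(path, "r+b") as f:
        f.seek(offset)
        original = f.read(1)
        if not original:
            raise ValueError("offset is past the end of the file")
        f.seek(offset)
        f.write(bytes([original[0] ^ 0xFF]))
    # Put the timestamps back so the file does not look modified.
    os.utime(path, ns=(stat_before.st_atime_ns, stat_before.st_mtime_ns))

# Demonstration on a temporary file rather than real array data.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")
    with open(path, "wb") as f:
        f.write(b"hello world")
    before = os.stat(path)
    flip_byte_keep_times(path, offset=4)
    after = os.stat(path)
    with open(path, "rb") as f:
        content = f.read()

same_mtime = before.st_mtime_ns == after.st_mtime_ns
same_size = before.st_size == after.st_size
content_changed = content != b"hello world"
print(same_mtime, same_size, content_changed)
```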

I think what Brahim meant is simply what he said in his previous sentence: Update will refuse to recalculate the PPU if there is a general problem with any of the DRU paths (failed drive, path not connected, etc.).
« Last Edit: December 31, 2013, 01:04:07 pm by ezechiel1917 »

Offline tonym

  • Jr. Member
  • Posts: 88
  • Karma: +3/-0
Re: bitrot prior to parity update?
« Reply #12 on: December 31, 2013, 01:56:20 pm »
How do we explain Brahim's second statement in the old post "Difference between update/verify/validate": "Data rot is only an issue when restoring, and the restore process will tell you of it (albeit too late)."?
[Update... Re-thinking this comment, I can now see that data rot would be detected on a restore, but only if the parity block for the corrupt file hadn't been re-computed.]

It would make eminent sense for Update to check the CRCs of all relevant files before updating the parity.
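
[Editor's sketch] The check proposed in this thread could look roughly like this: before a parity block is recomputed, every block that was NOT detected as changed is compared against its stored checksum, and the update is aborted on a mismatch so the old parity stays usable for recovery. This is not FlexRAID code; single XOR parity, CRC32 and all names are assumptions made for illustration.

```python
import zlib

class BitRotDetected(Exception):
    """Raised when an unchanged block no longer matches its stored checksum."""

def xor_blocks(blocks):
    """XOR equal-length byte blocks together (single-parity model, an assumption)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def checked_update(stripe, changed, stored_crc):
    """Recompute parity for one stripe after verifying every unchanged block.

    stripe:     dict of unit name -> current block content
    changed:    set of unit names detected as new/modified
    stored_crc: dict of unit name -> CRC32 recorded at the previous update
    """
    for name, block in stripe.items():
        if name not in changed and zlib.crc32(block) != stored_crc[name]:
            raise BitRotDetected(name)  # abort before the old parity is replaced
    for name in changed:
        stored_crc[name] = zlib.crc32(stripe[name])
    return xor_blocks(list(stripe.values()))

stripe = {"DRU1": b"AAAA", "DRU2": b"BBBB", "DRU3": b"CCCC"}
stored_crc = {name: zlib.crc32(block) for name, block in stripe.items()}

stripe["DRU2"] = b"BBxB"  # silent corruption, not flagged as modified
stripe["DRU3"] = b"DDDD"  # legitimate change that triggers the update

try:
    checked_update(stripe, {"DRU3"}, stored_crc)
    outcome = "updated"
except BitRotDetected as err:
    outcome = f"aborted: {err}"
print(outcome)  # aborted: DRU2
```

The extra cost is one checksum pass over the unchanged blocks in each stripe being updated, which is the trade-off Brahim points to in Reply #2.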
« Last Edit: December 31, 2013, 02:33:41 pm by tonym »

Offline Evandro

  • Newbie
  • Posts: 2
  • Karma: +0/-0
Re: bitrot prior to parity update?
« Reply #13 on: January 02, 2014, 03:06:43 pm »
Hello everyone,
I've posted a question for Brahim in another topic, but I think that someone here can help me:

I intend to build an array with 4x4TB DRU and 2x4TB PPU (as a starting point; it must grow with time), where the great majority of the data will rarely be modified, as it will be used to store videos and music as a backup system. My question is: if I regularly verify the whole array, does RAID-F protect against "bit rot"? If there is silent "bit rot", does the file/checksum get corrected during the verification process?

Thanks in advance.

Offline ezechiel1917

  • Newbie
  • Posts: 12
  • Karma: +1/-0
Re: bitrot prior to parity update?
« Reply #14 on: January 02, 2014, 05:19:43 pm »
A file's checksum is only updated if FlexRAID detects that the file has been modified.
The validate and verify processes don't update checksums at all; both help you identify potential problems with your files by verifying their checksums. And unless the bit rot has propagated to parity, or other DRUs of the same parity block also have bit rot (very low probability imo), you will be able to recover the original file.
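
[Editor's sketch] A read-only checksum scan in the spirit of what this post describes: stored checksums are compared with the files' current content and are never rewritten by the scan. CRC32 and the manifest layout are assumptions; the thread only says that FlexRAID keeps a CRC per file.

```python
import os
import tempfile
import zlib

def file_crc32(path, chunk_size=1 << 20):
    """CRC32 of a file, read in chunks so large media files fit in memory."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc

def validate(manifest):
    """Return the paths whose current CRC32 differs from the recorded one.

    The manifest (path -> CRC32) is only read, never updated.
    """
    return [path for path, crc in manifest.items() if file_crc32(path) != crc]

with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.bin")
    bad = os.path.join(tmp, "bad.bin")
    for path in (good, bad):
        with open(path, "wb") as f:
            f.write(b"some media payload")
    manifest = {path: file_crc32(path) for path in (good, bad)}

    with open(bad, "r+b") as f:  # simulate bit rot in one file
        f.write(b"S")

    failed = [os.path.basename(path) for path in validate(manifest)]

print(failed)  # ['bad.bin']
```

A file reported here would then be restored from parity, which works as long as the parity block covering it has not been recomputed since the corruption.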
« Last Edit: January 02, 2014, 05:24:33 pm by ezechiel1917 »