Author Topic: Understanding the limitations of Snapshot RAID  (Read 12797 times)


Offline xbmcg

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-0
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #1 on: January 19, 2012, 09:45:06 am »
One question: what happens with files that are open for writing while the snapshot is created? Are they ignored in the parity, are they kept in their last state, or are they considered new (not existing yet)?

Because these files (or databases, etc.) are open, no consistent restore is possible if they are simply used for the parity calculation; they would be restored to an inconsistent state in case of a hardware crash.

What are the risks of making snapshots on running servers? Is the behaviour dependent on the enabled "shadow copy" capabilities of the underlying file systems, and what is the impact / what are the advantages and disadvantages when choosing one of NTFS / FAT32 / EXT2 / EXT3 / EXT4 / HPFS /.../ SMB / NFS etc., and what happens when using a mix of them? What would be best practice?

« Last Edit: January 19, 2012, 09:47:05 am by xbmcg »

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,504
  • Karma: +201/-16
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #2 on: January 19, 2012, 01:37:43 pm »
You should never have a database on the snapshot RAID.

FlexRAID will try to sync all files, including open files.

Offline xbmcg

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-0
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #3 on: January 19, 2012, 03:15:28 pm »
Thank you.

But what is the impact if files are changed between two snapshots?

Assume we have this config:

DRU 1, DRU 2, and a PPU, storing files A, B, C, D, E, F, G. Then B is modified to B1 on DRU 1:


DRU1   DRU2   PPU

A      D      PAD
A      D      PAD
A      E      PAE
B      E      PBE
B      F      PBF
B      F      PBF
B      F      PBF
B      F      PBF
C      F      PCF
C      G      PCG

So if B is modified and afterwards DRU 2 fails:

DRU1   (DRU2)   PPU

A      (D)      PAD
A      (D)      PAD
A      (E)      PAE
B1     (E)      PBE
B1     (F)      PBF
B1     (F)      PBF
B1     (F)      PBF
B1     (F)      PBF
C      (F)      PCF
C      (G)      PCG

you can only restore files D and G, while E and F will be corrupted, unless a snapshot / shadow copy of DRU 1 exists from the time of the parity snapshot and the parity is built on that shadow copy, so there is a consistent base from which one can restore "back in time". If B changes in length, it would be even worse: all subsequent parity calculations would go out of sync.

Is my assumption correct? A file edit would then compromise not only the edited file but all files on the other drives whose parity is calculated together with the edited file. So the more DRUs you have, the more files can be corrupted by a single change to the data; e.g. with 12 DRUs and 1 surviving file changed: at least 11 files not recoverable?
« Last Edit: January 19, 2012, 03:26:22 pm by xbmcg »
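
A minimal sketch of the scenario above, using simple XOR parity (FlexRAID's actual parity routine may differ, and the block contents here are made up), shows why a block like E comes back corrupted once B has been replaced by B1 on the surviving drive:

Code:
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))

# Contents of one parity row at snapshot time (made-up 8-byte blocks).
B = b"old data"        # block of file B on DRU 1
E = b"E-block!"        # block of file E on DRU 2, same row
PBE = xor_blocks(B, E) # parity written to the PPU at snapshot time

# Later, B is edited in place on DRU 1; the parity is NOT updated.
B1 = b"new data"

# DRU 2 fails; E has to be rebuilt from the surviving drive and the PPU.
E_from_stale = xor_blocks(B1, PBE)  # uses the edited block -> garbage
E_from_clean = xor_blocks(B, PBE)   # uses the snapshot-time block

print(E_from_clean == E)  # True:  consistent restore
print(E_from_stale == E)  # False: E comes back corrupted

D and G in the tables above are unaffected because their parity rows only involve the unmodified blocks A and C.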

Offline Ramshackles

  • Jr. Member
  • **
  • Posts: 81
  • Karma: +8/-3
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #4 on: January 19, 2012, 03:36:46 pm »
You plan on frying 10 DRUs while the 1 DRU failed? Otherwise you wouldn't need to restore anything from them.

I can only see a problem for the drive that failed?

Offline xbmcg

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-0
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #5 on: January 19, 2012, 03:50:12 pm »
Oh, sorry, my fault. Of course only the storage area on the failed disk that shares its parity region with the modified file will be corrupted, no matter how many DRUs one has.

But what happens if you have 2 PPUs? Is there then a chance to restore the affected area?

Assume that on restore the program recognizes that B was changed to B1 after the snapshot.

Again, DRU 1 contains B and DRU 2 fails. Then all files could be recovered using both PPUs.

Even if you have more DRUs, the chances are good of making a complete restore of the failed DRU's content, as long as the count of "dirty regions" for any single portion of parity across the DRU drives is less than the count of PPUs in use, and the restore process can recognize that a particular region was modified after the snapshot.

Does it make sense?
« Last Edit: January 19, 2012, 03:55:01 pm by xbmcg »
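
A small arithmetic sketch of that idea, using two independent parity equations over plain integers (real multi-PPU engines typically use Galois-field coding such as Reed-Solomon, so this only illustrates the principle, not FlexRAID's actual math):

Code:
# One parity row across three DRUs at snapshot time (made-up values).
d = [11, 22, 33]   # d[0] on DRU 1, d[1] on DRU 2, d[2] on DRU 3

# Two independent parity equations, one per PPU.
P = sum(d)                                      # PPU 1: plain sum
Q = sum((i + 1) * v for i, v in enumerate(d))   # PPU 2: weighted sum

# DRU 2 fails (d[1] is gone) AND d[0] was edited after the snapshot,
# so its current on-disk value is useless for reconstruction.  If the
# restore process knows d[0] is dirty, it treats d[0] and d[1] as the
# unknowns x0, x1 and solves the two equations:
#   x0 +     x1 = P - d[2]
#   x0 + 2 * x1 = Q - 3 * d[2]
rhs1 = P - d[2]
rhs2 = Q - 3 * d[2]
x1 = rhs2 - rhs1   # subtracting the equations isolates x1
x0 = rhs1 - x1

print((x0, x1) == (11, 22))  # True: both snapshot-time values recovered

With a single PPU, that row would have two unknowns and only one equation, so it could not be reconstructed, which matches the single-PPU corruption scenario discussed earlier.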

Offline Ramshackles

  • Jr. Member
  • **
  • Posts: 81
  • Karma: +8/-3
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #6 on: January 19, 2012, 04:05:48 pm »
I must admit I haven't got any experience with any engine other than T1+, which only supports 1 PPU (in my case I have actually had 2 configurations with 10 DRUs + 1 PPU each), so I could actually lose 1 drive from each config without any issues at all.

I'm not entirely sure I understand you correctly (English is not my native tongue), but only data on the failed drive is at risk of being "corrupted", and only up to the unsynced point.

Say you lost 1 disk and had edited 1 GB of data out of the 2 TB on it; "only" that 1 GB would be in jeopardy. But if I understood Brahim correctly some time back, the files could very well still be usable, depending on their contents anyhow.

Offline xbmcg

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-0
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #7 on: January 19, 2012, 04:22:10 pm »
No, I doubt that is exactly the case.

You could lose any data, not only the changes to the file that happened after the sync, and the data lost is not the changed file but the files in the region of the changed file on the failed drive, if the changed file is on a DRU that has survived. So you do not know which files are corrupt on the restored DRU (unless you have hash checksums for each file stored somewhere and can compare them against the checksums computed after a restore).

If the changed content is ONLY on the failed DRU, you are fine. You can consistently restore back to the time of the snapshot, because the parity AND the data on the remaining DRUs are still in a consistent state.

The only way around this is to use snapshot RAID in conjunction with shadow copies. If you create a shadow copy of each DRU while creating the parity information, and use those shadow copies of the surviving DRUs together with the PPU to restore the failed DRU, you will always get a consistent back-in-time result: the state of the DRUs at the time the snapshot was taken, no matter what edits happen afterwards on the other DRUs. The point is: the DRUs and the PPU must be consistent / match in order to be able to calculate the missing data. If one has changed, the result is undefined. With shadow copies you have a "frozen" state that matches the state of the PPU.
« Last Edit: January 19, 2012, 04:35:54 pm by xbmcg »
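
The checksum idea mentioned above can be done with ordinary tooling, independently of FlexRAID. A minimal sketch (the paths and manifest format are hypothetical): build a SHA-256 manifest of a DRU when the parity snapshot is taken, then compare against it after a restore to see which files no longer match.

Code:
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root: Path, manifest: Path) -> None:
    """Record a checksum for every file at parity-snapshot time."""
    sums = {str(p.relative_to(root)): sha256_of(p)
            for p in root.rglob("*") if p.is_file()}
    manifest.write_text(json.dumps(sums, indent=2))

def verify_manifest(root: Path, manifest: Path) -> list[str]:
    """Return the files that are missing or no longer match after a restore."""
    sums = json.loads(manifest.read_text())
    return [name for name, digest in sums.items()
            if not (root / name).is_file() or sha256_of(root / name) != digest]

# Example usage (paths are hypothetical):
# build_manifest(Path("D:/dru1"), Path("E:/manifests/dru1.json"))
# ... DRU fails, restore runs ...
# print(verify_manifest(Path("D:/dru1"), Path("E:/manifests/dru1.json")))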

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,504
  • Karma: +201/-16
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #8 on: January 19, 2012, 04:33:35 pm »
The restore process will tell you if a restored file is corrupted.

Offline Ramshackles

  • Jr. Member
  • **
  • Posts: 81
  • Karma: +8/-3
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #9 on: January 19, 2012, 05:10:56 pm »
If you are using the snapshot RAID as it is supposed to be used, this shouldn't be much of an issue (on data archives there probably won't be many edits, and those would probably be small ones; a few KB here and there seems likely, for archived data anyway).

My scenario is storage of all my ripped movies, and that data never changes; it is only added to or removed, which poses no danger. If your data is edited frequently, you should really be using real-time RAID instead.

And as Brahim mentioned, you will be notified which files are corrupt.

Offline xbmcg

  • Newbie
  • *
  • Posts: 10
  • Karma: +0/-0
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #10 on: January 20, 2012, 01:44:18 am »
Yes, I think it would be a good idea to mount these storages read-only, since XBMC and other HTPC software scans the media and adds / modifies information like thumbs, .nfo files, last playback times, etc.

Even such small changes can corrupt a restore, so it would be a good idea to initially set up the archives, then freeze them read-only, and then take the snapshot. Any time an update is necessary, the storage has to be unlocked, changed, locked, and snapshot-updated...

Otherwise you just "feel" safe - but you are not.

It is not a bad thing anyway; you just have to be aware of the impacts / risks so you can mitigate them appropriately or choose the right solution for your needs.

But anyway, is it intended to combine the power of shadow copies with snapshot RAID? This could massively improve the reliability of a snapshot RAID, since edited data would no longer be an issue.
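
A rough sketch of that unlock / change / lock / update cycle, assuming a Linux host where each DRU is an ordinary mount point (the mount points are hypothetical, and the parity-update step is left as a placeholder since it depends on how your FlexRAID installation is driven):

Code:
import subprocess

# Hypothetical mount points for the DRUs on a Linux host.
DRUS = ["/mnt/dru1", "/mnt/dru2"]

def remount(mountpoint: str, mode: str) -> None:
    """Remount a DRU read-only ("ro") or read-write ("rw")."""
    subprocess.run(["mount", "-o", f"remount,{mode}", mountpoint], check=True)

# 1. Unlock: make the archives writable for the planned change.
for m in DRUS:
    remount(m, "rw")

# 2. Change: copy in / edit the files that need updating here.

# 3. Lock: freeze the archives again before refreshing parity.
for m in DRUS:
    remount(m, "ro")

# 4. Update: trigger the snapshot / parity update (placeholder; this step
#    depends on your FlexRAID setup, so no specific command is shown here).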

Offline Ramshackles

  • Jr. Member
  • **
  • Posts: 81
  • Karma: +8/-3
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #11 on: January 20, 2012, 12:27:43 pm »
I use XBMC as well, and after the initial adding to the library there really is no editing of the files. It depends on how you do things, I guess, but last played etc. is actually saved in XBMC's database.

Then again, my process is downloading to a location outside the pool, getting the metadata, and then moving to the pool, all done automatically, and XBMC doesn't alter / edit any of the files afterwards.

But I'm sure there are situations, or other media players / centers, that do.

Offline dscline

  • Sr. Member
  • ****
  • Posts: 272
  • Karma: +6/-0
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #12 on: February 13, 2012, 10:15:56 am »
Quote from: xbmcg on January 20, 2012, 01:44:18 am
Yes, I think it would be a good idea to mount these storages read-only, since XBMC and other HTPC software scans the media and adds / modifies information like thumbs, .nfo files, last playback times, etc.

Even such small changes can corrupt a restore, so it would be a good idea to initially set up the archives, then freeze them read-only, and then take the snapshot. Any time an update is necessary, the storage has to be unlocked, changed, locked, and snapshot-updated...
Late reply here, but... I use MyMovies, not XBMC, but there's a similar issue in that MyMovies can occasionally update metadata files, cover scans, etc. The simple solution for me is: exclude those files from the RAID. Since MyMovies can re-download that data when needed, there's really no need to protect them.
WHS 2011
tRAID final 23 DRUs 2PPUs
Supermicro C2SEA, Q9505s (stock), 4GB
Supermicro AOC-SASLP-MV8
IBM m1015 flashed to LSI 9211-8i/IT + HP SAS Expander
Generic SiI3132

Offline Cassius

  • Full Member
  • ***
  • Posts: 138
  • Karma: +0/-0
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #13 on: April 05, 2012, 02:12:37 pm »
From the wiki: Understanding the limitations of Snapshot RAID

"Deletes will compromise recovery UNLESS the operations are done through the storage pool and FlexRAID’s proprietary recycle bin feature is turned on"

So does that apply to deletes performed through a network share?

Does FlexRAID’s proprietary recycle bin work like Windows Recycle Bin where I can empty its contents? If so, do I have to run an update after emptying the recycle bin?

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,504
  • Karma: +201/-16
    • View Profile
Re: Understanding the limitations of Snapshot RAID
« Reply #14 on: April 06, 2012, 11:58:00 am »
Quote from: Cassius on April 05, 2012, 02:12:37 pm
From the wiki: Understanding the limitations of Snapshot RAID

"Deletes will compromise recovery UNLESS the operations are done through the storage pool and FlexRAID’s proprietary recycle bin feature is turned on"

So does that apply to deletes performed through a network share?

Does FlexRAID’s proprietary recycle bin work like Windows Recycle Bin where I can empty its contents? If so, do I have to run an update after emptying the recycle bin?

FlexRAID's recycle bin feature works better than Windows' version, for the simple fact that the Windows Recycle Bin is not supported for network operations.
With FlexRAID, the feature works the same for both local and network operations.