FlexRAID Support Forum

FlexRAID™ Transparent RAID™ (tRAID™) => General Discussion => Topic started by: Brahim on July 08, 2013, 12:09:13 pm

Title: Verify & Sync task vs Create Parity task
Post by: Brahim on July 08, 2013, 12:09:13 pm
Verify & Sync task vs Create Parity task

An interesting discussion took place in the Transparent RAID RC1 thread.

A number of users have been reporting that the Verify & Sync task was running much faster than the Create Parity task when used to initialize an array.
Well, it turns out that this is to be expected.

I indeed designed the Verify & Sync task to be faster than the Create Parity task from a theoretical perspective.
I really wasn't sure how it would translate in practice, but mathematically, it is supposed to be faster.

The theory is as follows:
1. That users will typically have significant unused disk space that has never been written to and is effectively all zeros
2. That even for space where data has been written, there is a great probability that the disks will have blocks of matching data

What I was not sure of was the overhead of first checking the data for a match and then writing only where there is a mismatch.
In the Verify & Sync task, the parity disks are subjected to both read and write operations, which decreases their overall write performance.
In contrast, in a Create Parity task, the parity disks only execute write operations, which maximizes the write throughput.

I was not certain of the net effect or which task it would favor.
Well, per current user reports, the net effect favors the Verify & Sync task. That makes sense for cases where the Verify & Sync task does significantly fewer writes than the Create Parity task.
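
The trade-off described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not FlexRAID's actual implementation: it assumes single XOR parity and 4K blocks, the helper names (`xor_blocks`, `make_disk`, `create_parity`, `verify_and_sync`) are made up, the disks are tiny in-memory lists, and the parity disk is assumed to start out zeroed like never-written space. It only counts parity writes and ignores the cost of the extra parity reads.

```python
import random

BLOCK_SIZE = 4096  # 4K blocks, as mentioned in the thread; illustrative only


def xor_blocks(blocks):
    """XOR equally sized blocks together (assumed single-parity scheme)."""
    acc = 0
    for block in blocks:
        acc ^= int.from_bytes(block, "big")
    return acc.to_bytes(BLOCK_SIZE, "big")


def make_disk(total_blocks, used_blocks, rng):
    """Hypothetical disk: random data in the used part, zeros in the rest."""
    return [
        bytes(rng.getrandbits(8) for _ in range(BLOCK_SIZE))
        if i < used_blocks else bytes(BLOCK_SIZE)
        for i in range(total_blocks)
    ]


def create_parity(data_disks, parity_disk):
    """Blind write: compute parity for every stripe and always write it."""
    writes = 0
    for stripe in range(len(parity_disk)):
        parity_disk[stripe] = xor_blocks([d[stripe] for d in data_disks])
        writes += 1
    return writes


def verify_and_sync(data_disks, parity_disk):
    """Read the existing parity and write only where it does not match."""
    writes = 0
    for stripe in range(len(parity_disk)):
        expected = xor_blocks([d[stripe] for d in data_disks])
        if parity_disk[stripe] != expected:
            parity_disk[stripe] = expected
            writes += 1
    return writes


rng = random.Random(0)
data = [make_disk(64, 8, rng), make_disk(64, 16, rng), make_disk(64, 4, rng)]
blank_parity = [bytes(BLOCK_SIZE)] * 64  # assumed: parity disk starts zeroed

print("Create Parity writes:", create_parity(data, list(blank_parity)))
print("Verify & Sync writes:", verify_and_sync(data, list(blank_parity)))
```

With these made-up inputs, Create Parity writes all 64 parity blocks, while Verify & Sync writes only the 16 stripes that actually hold data. Whether the saved writes outweigh the extra reads on real hardware is exactly the open question in this thread.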

When initializing an array made up of fairly new disks, the Verify & Sync task is guaranteed to be faster.

So, when initializing or wanting to re-create the parity data, do consider using the Verify & Sync task instead as it might be significantly faster.
Title: Re: Verify & Sync task vs Create Parity task
Post by: SirMaster on July 08, 2013, 12:46:12 pm
So you are saying it turns out to be faster to check and write only if mismatch than to always write?

I would have thought a blind write only would be faster than a read and write.

Because for a blind write, you have to write 100% of the disk.  With read/write, you have to read 100% of the disk and write maybe 50% on average depending on the data.  I thought read and write speeds for mechanical disks were fairly close.

I guess the way that you have to write is not the same as a sequential read and write speed test.
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on July 08, 2013, 01:31:06 pm
Yep. You would think blind writing would be faster.

Nevertheless, something clicked in me a while back even though I was not 100% sure of my hunch.
I just implemented it, made it an option for RAID initialization, but never really benchmarked it (got busy with other things).

This is one of those things where you can theorize all day, but testing will hold the final say. :)
Plus, I do have some optimizations in my implementation that I think are making a difference here.

Another motivation for implementing it is that it creates less wear and tear on SSDs.
Title: Re: Verify & Sync task vs Create Parity task
Post by: NLS on July 08, 2013, 01:48:46 pm
You realise that if a method is both better and faster, the other should just go (or remain as admin option).


Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on July 08, 2013, 02:27:12 pm
Quote
You realise that if a method is both better and faster, the other should just go (or remain as admin option).

No, "I'd rather be a slow turtle so I don't get run over".  ::)

Focus. We need to collect more data so that better recommendations can be made.

For one user here things were much faster only because he had already created parity on those drives.
What we need to do is collect stats for users coming in with fresh new drives and for users coming in with heavily used drives.
There are cases where the Verify & Sync task will simply be slower. The scenarios users are going to encounter in practice are what matter.

No jumping to conclusions yet. This thread is more of a "food for thought" kind of thing, meant to recommend giving the "Verify & Sync" task a shot.
The Create Parity task is predictable and will perform the same regardless of the drives contents, which is why it is the one recommended in the tutorials.

Title: Re: Verify & Sync task vs Create Parity task
Post by: monkeysez on July 08, 2013, 03:33:30 pm
When running the verify and sync command, at the end of it I received the following message. Does that mean that parity was not created properly?
Title: Re: Verify & Sync task vs Create Parity task
Post by: NLS on July 08, 2013, 03:40:21 pm
No it's what it synced.

Title: Re: Verify & Sync task vs Create Parity task
Post by: monkeysez on July 08, 2013, 03:48:44 pm
Quote
No it's what it synced.

Gotcha. As a suggestion, I would add some language to clarify that verify/sync was successfully completed.
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on July 08, 2013, 04:21:29 pm
Quote
Gotcha. As a suggestion, I would add some language to clarify that verify/sync was successfully completed.

Done. :)
Title: Re: Verify & Sync task vs Create Parity task
Post by: NLS on July 08, 2013, 04:23:40 pm
I think I remember someone mentioning this before. I might be wrong.
Title: Re: Verify & Sync task vs Create Parity task
Post by: facke02 on July 08, 2013, 07:37:21 pm
During my testing, Verify Sync was 3 times faster than Create Parity.  I've run each about half and half when I recreate the RAID.
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on July 08, 2013, 08:41:24 pm
What would be more meaningful is for everyone to post their Verify Sync task report.
What I want to see is the number of reported 4K blocks processed.
Title: Re: Verify & Sync task vs Create Parity task
Post by: Gravis856 on July 09, 2013, 11:27:33 am
I'll post mine when it finally finishes. Started/stopped Create parity getting 95 MB/s and now am getting ~328 MB/s with Verify & Sync.

Edit: And 5 hours later it's slowly dropped to 35 MB/s. I might abort it.
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on July 10, 2013, 04:11:36 am
Quote
I'll post mine when it finally finishes. Started/stopped Create parity getting 95 MB/s and now am getting ~328 MB/s with Verify & Sync.

Edit: And 5 hours later it's slowly dropped to 35 MB/s. I might abort it.

Yikes! 328 MB/s to 35 MB/s?
Something is majorly wrong here.

The absolute worst case scenario for Verify & Sync is that it might get 1/3 slower than Create Parity.
However, that's for the most extreme case (very unlikely scenario).

How many PPUs do you have?
Title: Re: Verify & Sync task vs Create Parity task
Post by: Gravis856 on July 10, 2013, 04:17:52 pm
Quote
Yikes! 328 MB/s to 35 MB/s?
Something is majorly wrong here.

The absolute worst case scenario for Verify & Sync is that it might get 1/3 slower than Create Parity.
However, that's for the most extreme case (very unlikely scenario).

How many PPUs do you have?

One 2TB PPU for four 2TB DRUs. Four of the disks are WD Greens and one is a Seagate. I just went back to RAID-F for the time being and got a steady ~69MB/s until the RAID was initialized.
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on July 10, 2013, 05:07:12 pm
Ok. Once we start talking about performance tuning, some of these tests will be re-evaluated.

I suspect that some of the changes made to ensure greater data integrity are affecting different systems differently.
It would have been great if we could bypass this crippling during RAID initialization. However, many of the settings are system wide and cannot be turned on and off without rebooting.
 
Title: Re: Verify & Sync task vs Create Parity task
Post by: priley2327 on July 10, 2013, 08:44:06 pm
Quote
No, "I'd rather be a slow turtle so I don't get run over".  ::)

But if you're a slow turtle and in the middle of the road, what happens?
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on July 10, 2013, 08:57:26 pm
Quote
But if you're a slow turtle and in the middle of the road, what happens?

/facepalm
Title: Re: Verify & Sync task vs Create Parity task
Post by: NLS on July 11, 2013, 02:15:43 am
Hahahahahahaha!
Title: Re: Verify & Sync task vs Create Parity task
Post by: vletroye on October 24, 2013, 04:06:51 pm
Coming back to this topic...

Am I wrong, or is it no longer possible with RC10 to use "Verify & Sync" to initialize the parity?
I tried to start a new array to create its parity, then abort and finally use "Verify & Sync", but that option was disabled :(

Initializing the parity (I did it twice) took 33 hours with a mean throughput of 240MB/s (and a max of 252MB/s).
I am now running a "Verify & Sync". It's running with a mean throughput of 450MB/s. It should complete, without anything to sync, in less than 20 hours. I will report back ;).
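
As a rough cross-check of these figures, the estimate can be reproduced with a few lines of arithmetic. This is a sketch that assumes both tasks process the same amount of data at a steady rate and that MB means 10^6 bytes; the function name is made up for illustration.

```python
def hours_needed(total_mb, throughput_mb_s):
    """Hours needed to process total_mb at a steady throughput."""
    return total_mb / throughput_mb_s / 3600


# Data volume implied by 33 hours at a mean of 240 MB/s.
total_mb = 240 * 33 * 3600

print(f"Implied volume: {total_mb / 1e6:.1f} TB")
print(f"Time at 450 MB/s: {hours_needed(total_mb, 450):.1f} hours")
```

Under those assumptions the run at 450MB/s would take about 17.6 hours, which is consistent with the "less than 20 hours" estimate.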

Next, I would like to remove one DRU (which is 100% empty), recreate the array from scratch and initialize it using "Verify & Sync"... But that will only be possible if I find out how to do it with RC10.


Also, I have ordered a new 3TB disk. When it arrives, I will add it to the array, move some data from the other disks onto it and try to recreate the parity using "Verify & Sync"... I will also report on that test here...

V.
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on October 24, 2013, 05:03:42 pm
What do you mean it is not possible?

The options the initialization panel gives you are pretty clear and include Verify & Sync.
Post a screenshot of what you claim is disabled.
Title: Re: Verify & Sync task vs Create Parity task
Post by: vletroye on October 25, 2013, 12:09:18 am
When one wants to initialize the parity on a new array with Verify&Sync, one has first to click on start, then abort and finally click on Verify&Sync (That's how I used to do it with the early RC).

But now the Verify&Sync button stays disabled after aborting the initialization?!
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on October 25, 2013, 08:19:33 am
Quote
When one wants to initialize the parity on a new array with Verify&Sync, one has first to click on start, then abort and finally click on Verify&Sync (That's how I used to do it with the early RC).

But now the Verify&Sync button stays disabled after aborting the initialization?!

What's the matter with some of you coming up with some of the craziest schemes ever?  ;D
How about not aborting after clicking start and being shown the proper initialization panel?  ::)

(http://wiki.flexraid.com/wp-content/uploads/2013/06/15.-RAID-Init-Options.png)

It might help to go back to the wiki for a refresher. ;)
http://wiki.flexraid.com/2013/06/27/creating-a-transparent-raid-array/
Title: Re: Verify & Sync task vs Create Parity task
Post by: vletroye on October 25, 2013, 12:46:51 pm
?? I am lost....

What I want to validate is that "Verify&Sync" is faster than the "parity initialization"...

On the wiki it seems that the "Verify&Sync" button is enabled even if the array has not yet been initialized?!
But that's not the case on my server?!
That is why I was trying to use this dirty trick I found ages ago (which no longer works) :-[

Could you confirm that "Verify&Sync" is enabled on your machines for newly created arrays ?
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on October 25, 2013, 01:31:46 pm
I had placed an image, which was not showing.
Fixed. See the post above. Initialization is done on the "Initialize RAID" panel.
What you do anywhere else does not have the same meaning.

All arrays must be initialized.
If you want the different initialization options, you can un-initialize the configuration through the advanced operations menu.

Now, it has been discussed in this very thread how Verify & Sync is faster only when parity was previously created using the same disks. On a fresh new set of disks that have not been through tRAID, the regular Create task should be faster.
Title: Re: Verify & Sync task vs Create Parity task
Post by: vletroye on October 25, 2013, 03:07:31 pm
 :o

I have seen that screen about a thousand times and never really paid attention to the first 2 entries  :-[
Incredible how a brain can filter information  ::)

Thx a lot for your patience  :D

V.
Title: Re: Verify & Sync task vs Create Parity task
Post by: Brahim on October 25, 2013, 04:09:48 pm
Quote
:o

I have seen that screen about a thousand times and never really paid attention to the first 2 entries  :-[
Incredible how a brain can filter information  ::)

Thx a lot for your patience  :D

V.

lol!  ;D
Oh, I can believe how that can happen.
I sneaked up and added those while you weren't looking.  :P
Title: Re: Verify & Sync task vs Create Parity task
Post by: erfa on November 09, 2013, 07:45:42 pm
Recently I switched from RAID-F to tRAID, after replacing my OS (from win7 to win8.1). I've been using RAID-F for about a year and a half, and I haven't really had any problems with it. My only issue was that it took a lot of time to get it up and running and to figure out some of the features. So before setting up tRAID I looked into several alternatives, mainly Windows Storage Spaces and StableBit Drivepool. Storage Spaces was particularly interesting since it was a free option in win8. But I was quickly disappointed, I guess from hearing about it I had bigger expectations. So next I tried Drivepool, and I have to admit that I really liked how simple it was. They have really nailed the interface and ease-of-use. I think even my parents might be able to use it without any help. Unfortunately it only has duplication of data, no parity. Which doesn't work well for me, IMO it's too much of a waste of space. FlexRAID is by far a more flexible solution than any other I have looked into. My only issue still remains the usability of the product.

Now back to the main reason I'm writing this post. After going through all the necessary wiki pages to figure out how to set up my tRAID pool, and following the instructions to the letter, I was taken by surprise near the end of my setup. As recommended I chose to "Create RAID Parity" when initializing the RAID (the other available options were "Do nothing" and "Verify & Sync"). I had to wait 11 hours for it to finish. 11 HOURS?! I can understand if the data disks had a lot of data on them and the parity had to be recreated, but all the disks in my array were empty and formatted. So what gives? Is this normal? What is it doing during all that time when there is no data on the disks?

Another thing that seemed a bit odd to me was that "total size" = 10.916TB in the report (screenshot attached). But I was so curious about the issue above that I didn't give it any further thought. What is the total size supposed to be? It can't be the size of the parity created or the total amount of data on the DRUs, or even the sum of capacity on all the disks in the array.

Oh I almost forgot, here is some info about the PC I'm running tRAID on. I plan to continue using these disks and gradually replace the DRUs with WD Red 4TB drives. Now my fear is that it will take many days to recreate the parity with 4x 4TB.

DRU1 - 1 TB (931.513 GB) - WD Enterprise 7200 RPM
DRU2 - 1 TB (931.513 GB) - WD Enterprise 7200 RPM
DRU3 - 2 TB (1.819 TB) - Seagate Barracuda Green 5900 RPM
PPU1 - 4 TB (3.638 TB) - WD Red 5400 RPM

CPU: i5 2500 3.3GHz
RAM: 2x 4GB
OS: Win 8.1 Pro 64-bit (running on Samsung 840 Pro SSD)

Sorry btw. for the lengthy post :)
Title: Re: Verify & Sync task vs Create Parity task
Post by: terrastrife on November 09, 2013, 08:56:47 pm
Quote
Now back to the main reason I'm writing this post. After going through all the necessary wiki pages to figure out how to set up my tRAID pool, and following the instructions to the letter, I was taken by surprise near the end of my setup. As recommended I chose to "Create RAID Parity" when initializing the RAID (the other available options were "Do nothing" and "Verify & Sync"). I had to wait 11 hours for it to finish. 11 HOURS?! I can understand if the data disks had a lot of data on them and the parity had to be recreated, but all the disks in my array were empty and formatted. So what gives? Is this normal? What is it doing during all that time when there is no data on the disks?

Another thing that seemed a bit odd to me was that "total size" = 10.916TB in the report (screenshot attached). But I was so curious about the issue above that I didn't give it any further thought. What is the total size supposed to be? It can't be the size of the parity created or the total amount of data on the DRUs, or even the sum of capacity on all the disks in the array.

You are creating parity for the disk/volume, not the data, so whether the disk is full or empty, it will take the same time. Creating initial parity is the same for any form of RAID: you need to read all the 'space' in the array to create parity.

The 'total size' is based on your parity disk size multiplied by the number of DRUs you have.
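
Applied to the disks listed earlier in this thread (one 3.638 TB PPU and three DRUs), that rule can be checked with a few lines. This is just the arithmetic of the explanation above, not taken from the tRAID report itself; the function name is made up for illustration.

```python
def reported_total_size_tb(parity_disk_tb, dru_count):
    """'Total size' per the rule above: parity disk size times DRU count."""
    return parity_disk_tb * dru_count


total = reported_total_size_tb(3.638, 3)
print(f"{total:.3f} TB")
```

This gives roughly 10.914 TB, close to the 10.916 TB shown in the report; the small gap is presumably rounding in the displayed disk size.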
Title: Re: Verify & Sync task vs Create Parity task
Post by: erfa on November 10, 2013, 03:42:26 pm
Quote
You are creating parity for the disk/volume, not the data, so whether the disk is full or empty, it will take the same time. Creating initial parity is the same for any form of RAID: you need to read all the 'space' in the array to create parity.

The 'total size' is based on your parity disk size multiplied by the number of DRUs you have.

Ah, I see. Thanks for the quick and precise answer! My ignorance on the matter is totally at fault :)

It was mentioned in this thread that the Verify & Sync task was faster at creating the initial parity, so I decided to try it out just for fun. I've added a screenshot of the result in case anyone is interested to compare with the previous one, it's exactly the same setup, the only difference is the task I chose (Verify&Sync vs Create Parity). As you can see it took 5 hours and 50 minutes to do Verify & Sync, compared to 11 hours with the Create Parity task. Since it's such a big difference, I'm left wondering if it's safe to continue using this raid or if I should delete it and create a new one using the Create Parity task. Any thoughts?
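
For scale, the two durations reported in this post compare as follows. This is only arithmetic on the 11 hours and 5 hours 50 minutes quoted above; the variable names are illustrative.

```python
create_parity_min = 11 * 60     # Create Parity: 11 hours
verify_sync_min = 5 * 60 + 50   # Verify & Sync: 5 hours 50 minutes

speedup = create_parity_min / verify_sync_min
time_saved = 1 - verify_sync_min / create_parity_min

print(f"Speedup: {speedup:.2f}x")
print(f"Time saved: {time_saved:.0%}")
```

That works out to a speedup of about 1.89x, or roughly 47% less time, on this particular set of empty disks.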
Title: Re: Verify & Sync task vs Create Parity task
Post by: vletroye on November 11, 2013, 02:35:56 am
It is assumed to be 100% safe (it's something I have tested several times without any issue).
The difference is due to the very large number of parity bits that were already correct and didn't need to be rewritten.

How much faster it will be depends on the hardware configuration, as far as I have experienced.
E.g., if the bus/disks are slow and a lot of data is read and written in parallel, avoiding writes will speed up the process.

But here are more details provided by Brahim:

Quote
Verify & Sync can potentially be faster when:
- All disks are brand new disks (no data was ever written to them)
- When recomputing parity for a set of disks that used to be in an array of the exact same disk configuration (say, you deleted the old config, took some disks online, and now want to re-create the array same as it was before)

Outside of those two cases, the penalty for using Verify & Sync could be severe.


On my server, "Create-Parity" takes 9h30 for 16TB used as DRU and 2x3TB used as PPU (all DRU on one LSI controller and all PPU on another distinct LSI controller).
A Verify&Sync is not faster... This is IMO logical as the PPU are on a distinct controller and are therefore not an IO bottleneck for the "Create-Parity" process...
Title: Re: Verify & Sync task vs Create Parity task
Post by: dscline on December 01, 2013, 04:32:02 pm
I believe I understand the reasons for the performance differences between the two scenarios, but beyond the speed difference of Verify+Sync vs. Create, there's a usability difference.  From the wiki:

Quote
Another purpose for this option is to delay the parity computation till after the RAID has been deployed to minimize down time. With the normal RAID initialization process, you won’t have access to your data till the parity computation is complete. In contrast, with this option, one can deploy the array making all data available right away for use (read/write) and then have the parity computation take place in the background by running the Verify/Sync task later on.

So for a large amount of data, I could see a benefit of using Create+Verify, even if it takes longer.  But if you can access the data during a create/verify, why can't you do it during a create?
Title: Re: Verify & Sync task vs Create Parity task
Post by: pclausen on August 10, 2014, 06:14:00 am
Here are some real world examples of doing both.  tRAID1 contains 17.3TB of data out of 34.6TB and tRAID2 contains 22.5TB out of 37.3TB.

Create Parity task:

(http://www.cstone.net/~dk/tRAID2100percent.JPG)

Verify & Sync:

(http://www.cstone.net/~dk/verifysync.JPG)

So the throughput is higher during the Create Parity task, so much so that it took 7 hours less to complete.

Not sure why there is a throughput difference between tRAID1 and tRAID2.  Both arrays have dedicated IBM 1015 controllers, identical SAS backplanes and an identical pair of PPUs (Seagate .15 4TB units).

Btw, I did the Verify & Sync tasks with both arrays online and was reading and writing data to both arrays without any issues.