Author Topic: RT RAID: From experimental to stable initiative (part 2)  (Read 8801 times)

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,525
  • Karma: +202/-16
RT RAID: From experimental to stable initiative (part 2)
« on: October 03, 2012, 02:51:41 pm »
Part 1: http://forum.flexraid.com/index.php/topic,633.0.html

I need users to post about their current issues with RT RAID.

1. Make sure you have adhered to these posted guidelines: http://wiki.flexraid.com/2012/06/10/real-time-raid-dos-and-donts/
2. Download the most current test release and patch listed in this thread to see whether your current issues are resolved.

Current test release:
http://download.openegg.org/temp/FlexRAID-2.0-Final_u11t5.exe

Current test patch [download and replace the file in <install root>\ext (make sure to stop FlexRAID first)]:
None.


« Last Edit: October 07, 2012, 10:00:48 pm by Brahim »

Offline Brahim

Re: RT RAID: From experimental to stable initiative (part 2)
« Reply #1 on: October 04, 2012, 10:40:11 am »
Fragmentation report task:
FlexRAID has an undocumented task for RT RAID that prints information about fragmented files to the logs.
This task has not been incorporated in the WebUI yet and can only be executed from the CMD clients: http://www.flexraid.com/download/#download

The command to run using the CMD client is: live class1_0 fragmentation-report

Only files that are heavily fragmented (let's say more than 1000 fragments) are of concern. Reducing fragmentation helps both with resource utilization and performance.
If you have a file that is heavily fragmented, you should move it out of the array (copy and then delete) and then move it back into the array using Windows Explorer (or any tool that preallocates space before copying files).

As posted on the wiki:
Quote
Avoid TeraCopy! TeraCopy fails to pre-allocate files and creates unnecessary fragmentation. Instead, and if you are not a fan of Windows Explorer, use tools like UltraCopier, FastCopy and KillCopy (make sure to configure them to pre-allocate on file copy if that’s not the default)
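
To illustrate what "pre-allocate on file copy" means, here is a minimal Python sketch. It is not how FlexRAID or any of the tools named above work internally; the function name `copy_preallocated` and the chunk size are assumptions made for the example. The idea is to set the destination file to its final length before any data is written, so the filesystem has a chance to reserve the space in one go instead of growing the file chunk by chunk.

```python
import os
import shutil

CHUNK = 4 * 1024 * 1024  # copy in 4 MiB chunks (arbitrary choice for this sketch)


def copy_preallocated(src, dst):
    """Copy src to dst, sizing dst to its final length before writing data.

    Hypothetical helper for illustration only. Whether the early resize
    actually reserves contiguous space depends on the OS and filesystem.
    """
    size = os.path.getsize(src)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # Extend the empty destination to the full size up front.
        fout.truncate(size)
        # truncate() does not move the file position, but be explicit.
        fout.seek(0)
        shutil.copyfileobj(fin, fout, CHUNK)
    # Carry over timestamps and permission bits, as a file manager would.
    shutil.copystat(src, dst)
    return size
```

On many POSIX filesystems the resize above only creates a sparse file; `os.posix_fallocate` is the closer equivalent there. Treat this as a sketch of the technique rather than a replacement for the copy tools recommended on the wiki.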

« Last Edit: October 22, 2012, 09:50:10 pm by Brahim »

Offline DrBlaze

  • Sr. Member
  • ****
  • Posts: 281
  • Karma: +14/-0
Re: RT RAID: From experimental to stable initiative (part 2)
« Reply #2 on: October 15, 2012, 05:49:05 am »
As I detailed in this thread, I had a problem where Reconcile reported success but I was then asked to perform a forced-sync-verify: http://forum.flexraid.com/index.php/topic,1254.0.html


This was on 2.0 Final Update 10r2.  I installed 2.0-Final_u11t5 and when I restarted I was able to start the pool, no read-only, no need for reconcile, no need for forced-sync-verify.

Nice work

Offline Brahim

Re: RT RAID: From experimental to stable initiative (part 2)
« Reply #3 on: October 15, 2012, 08:11:31 am »
DrBlaze, please either re-create the RAID or run forced-sync-verify one more time after migrating to the new build from 10r2.


Offline DrBlaze

« Reply #4 on: October 16, 2012, 08:23:07 am »
Yes, my enthusiasm was premature.  It wasn't working properly.

So, Reconcile (12 mins) needed to be run before Forced-Sync-Verify.  When that completed I restarted the service and tried to start the pool, but was told that Reconciliation was required.  I did it, but was then told to run Forced-Sync-Verify because of pending deletes.

I'm going to re-create and start fresh.

Offline DrBlaze

« Reply #5 on: October 20, 2012, 12:50:51 pm »
So, I've kept the logs running in trace since the Re-create, copied ~100GB into the array with no problems, and done a bit of moving directories around.

Today I decided to clean out my "to delete" folder.  The 1st delete went fine, and I waited a few extra minutes after FR was done updating the parity, but the 2nd delete failed.  It contained a handful of smaller files and one ~15GB file.  When I hit delete it said it was deleting ~200MB and updated immediately.  It seemed to lose track of the larger file (the parity update should have taken a few minutes).

This is the identical problem I had a few days ago which led to the last Re-create (because of the ongoing problem with Reconcile not working).  So I am now in read-only mode.  When I eventually close the pool and try to open it again it will ask me to Reconcile (which likely won't work, followed by Force-Sync-Verify).

Here are the trace-level logs for the delete error.  I am not doing anything further until I hear from you, in case you have a patch or something special you'd like me to test.

https://rapidshare.com/files/4168077818/FlexRAID.log

https://rapidshare.com/files/1354327617/FlexRAID.sys.c1-1.log

(All the extra activity in FlexRAID.sys.c1-1.log is uTorrent reading files. It does not download directly to the array, nor has it ever.)

 
« Last Edit: October 20, 2012, 12:55:28 pm by DrBlaze »

Offline Brahim

Re: RT RAID: From experimental to stable initiative (part 2)
« Reply #6 on: October 21, 2012, 08:26:07 am »
@DrBlaze
Actually, Reconcile should work better this time around. Just remember to restore the DBs.
I will take a look at the logs.

Offline DrBlaze

« Reply #7 on: October 21, 2012, 10:49:12 pm »
Restored the DB from the Re-create a few days ago and ran a 1-hour Reconcile.  I'm afraid I noticed no difference from all the other attempts:

[2012-10-21 23:25:07,525] INFO : *** Starting workers... ***
[2012-10-22 00:31:26,719] INFO : RealTime RAID reconciliation success!
[2012-10-22 00:31:27,150] WARN : Uncommitted pending deletes detected: please run the Forced Synch Verify task!
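
For anyone who wants to pull the relevant lines out of a long trace log before posting, here is a minimal Python sketch. It assumes only the line layout visible in the three lines above (`[timestamp] LEVEL : message`); the function names are made up for the example, and real logs may contain other layouts that this simply skips.

```python
import re
from datetime import datetime

# Matches lines shaped like: [2012-10-22 00:31:27,150] WARN : message text
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]\s+(\w+)\s*:\s*(.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
    return stamp, m.group(2), m.group(3)


def parse_log(lines):
    """Parse all matching lines and collect the WARN messages separately."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    warnings = [msg for _, level, msg in entries if level == "WARN"]
    return entries, warnings


sample = [
    "[2012-10-21 23:25:07,525] INFO : *** Starting workers... ***",
    "[2012-10-22 00:31:26,719] INFO : RealTime RAID reconciliation success!",
    "[2012-10-22 00:31:27,150] WARN : Uncommitted pending deletes detected: "
    "please run the Forced Synch Verify task!",
]
entries, warnings = parse_log(sample)
reconcile_time = entries[1][0] - entries[0][0]  # start of workers to success
```

Run against the excerpt above, `reconcile_time` comes out at a little over 1 hour 6 minutes, and `warnings` holds the single pending-deletes message.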

Logs, in trace:
https://rapidshare.com/files/2729522256/FlexRAID.log
https://rapidshare.com/files/2299777552/FlexRAID.sys.c1-1.log

Offline Brahim

Re: RT RAID: From experimental to stable initiative (part 2)
« Reply #8 on: October 21, 2012, 11:21:14 pm »
This is the expected behavior.
Reconcile cannot deal with failed pending deletes and you will need to run forced sync verify.

The current focus is improving the handling of pending delete operations.

Offline DrBlaze

« Reply #9 on: October 22, 2012, 11:54:47 am »
So I did the Forced-Sync-Verify and, upon trying to start the pool, was immediately told that a Reconcile is required (which will tell me about pending deletes and to please run Forced-Sync-Verify, and so on).  So unfortunately u11t5 does not seem to improve the Reconcile feature, and Forced-Sync-Verify is now broken (it's failed twice in the past few days).

My only option is to Re-create again.  Since Deleting has become a problem somewhere in the past couple of updates, and now Forced-Sync-Verify doesn't work, I may just re-create with u9, which was the most stable for me.  The only failures I had during that time were with hardware.  It had a bug moving folders, but that was just a minor annoyance and never caused any failures for me.

Offline Brahim

Re: RT RAID: From experimental to stable initiative (part 2)
« Reply #10 on: October 22, 2012, 02:50:14 pm »
Huh!  ???
I would have to check if I did indeed include the improvements I am thinking of in t5.
Logs?


Offline DrBlaze

« Reply #11 on: October 22, 2012, 04:46:30 pm »
https://rapidshare.com/files/82216405/FlexRAID.log

https://rapidshare.com/files/1677651309/FlexRAID.sys.c1-1.log

I'm going to do the Reconcile so I can at least run the pool in read-only mode.  I'll wait a day or so before I bite the bullet and Re-Create, just in case you come up with a different option than downgrading. 

Offline DrBlaze

« Reply #12 on: November 03, 2012, 08:48:26 pm »
Quote
Huh!  ???
I would have to check if I did indeed include the improvements I am thinking of in t5.
Logs?


Can you give me an update on this? I've been in read-only mode this whole time.  I've got ~1TB to add to the array, so I need to re-build.

Offline Brahim

Re: RT RAID: From experimental to stable initiative (part 2)
« Reply #13 on: November 03, 2012, 09:38:19 pm »

I will try to push build 11 out tomorrow.

Offline bisk

  • Full Member
  • ***
  • Posts: 103
  • Karma: +1/-2
Re: RT RAID: From experimental to stable initiative (part 2)
« Reply #14 on: November 07, 2012, 11:00:23 am »
Any plans to push U11 soon?