Author Topic: REAL-Time Auto Self-correct Question (to chkdsk or not to chkdsk, that is the ?)  (Read 2406 times)

Offline bigbob

  • Newbie
  • *
  • Posts: 36
  • Karma: +0/-0
The next best thing, however, is the ability to auto-correct. Whether it be from system crash, user error, bug or whatnot, being able to recover and auto-fix is quintessential.
FlexRAID in Real-Time RAID mode can now self correct without needing to bring the storage pool down. That’s right; FlexRAID can now self correct live! :)
The next step will be to make the RAID self-heal the data itself from data rot stemming from memory errors or failing drive errors. The foundation for this next feature is already in the Snapshot RAID implementation. It is now just a matter of bridging the two to close the circle.

Ok, I dashed my "expert mode" config and reinstalled beta 13 last night using a Cruise Control Storage Pool with Real-Time RAID (RTR) on a Windows 7 32-bit SP1 rig (2GB RAM, 3GHz C2D Intel proc).  The whole server configuration is below.

Machine has been rock stable with all the previous versions since 1.7, right up until I installed Beta 13.  I've had 2 BSODs in the last 24 hrs.  The last one occurred while Windows Media Center (WMC) was running.

The first BSOD occurred right after FlexRAID finished installing, before I had a chance to do anything else.  Unfortunately, I forget the first BSOD error message, but it was different from the second one.  The only customizations I made to the install were changing the value of “Removable Drive” to false and the “file system” to NTFS so that WMC could see directories in the pool, as per Brahim's guidelines.  WMC does not record live TV into the Pool.  The data in the pool is, for the most part, static.  After it finished initializing the RTR pool this evening, I used the RTR Cruise Control to create server shares of the directories in the pool that contain media (pics, tv, video, etc).

Then I started WMC and added the media directories as “locations” into WMC. 
The second BSOD occurred as it was searching / cataloging those directories.  The error was “A thread tried to release a resource it did not own". 

So the first thing I did after it rebooted was check the FR log:
[2012-01-17 20:10:49,177] ERROR: Invalid entry detected...
[2012-01-17 20:10:49,177] ERROR: Please run the reconcile operation

I was very distraught at another 24 hr reconcile cycle.   >:(   :'(

Then I checked to see if the pool was running and of course, it was not, but the drives were thrashing like crazy, like a reconcile had already started.  :o

I’m happy to report that after a few minutes (six I guess from the log) my pool started up on its own.  YEA! Go Brahim! Go Brahim! Go Brahim!    :)  :D  ;D
[2012-01-17 20:16:45,024] INFO : Using Auto-View definition...
[2012-01-17 20:16:45,024] INFO : Verifying real-time RAID previous exit state...
[2012-01-17 20:16:45,024] INFO : Building View Handler...
[2012-01-17 20:16:45,368] INFO : Mounting Storage Pool drive...
[2012-01-17 20:17:04,181] INFO : Reconnecting "Pictures-Home Vids-Graphics-Scans"="V:\FlexRAID Drive E\Pictures, Home Vids, Graphics, Scans" /GRANT:Everyone,CHANGE
[2012-01-17 20:17:04,618] INFO : Reconnecting "Our Files-Docs_Dwnlds_Email_etc"="V:\FlexRAID Drive N\Our Files  (Docs, Dwnlds, Email, etc)" /GRANT:Everyone,CHANGE
[2012-01-17 20:17:04,743] INFO : Reconnecting "Audio Books"="V:\FlexRAID Drive O\Audio Books" /GRANT:Everyone,CHANGE
[2012-01-17 20:17:04,977] INFO : Reconnecting Music="V:\FlexRAID Drive O\Music" /GRANT:Everyone,CHANGE
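
For anyone who wants to read the recovery time off the log instead of guessing, here is a rough Python sketch. It assumes every entry follows the `[date time,millis] LEVEL: message` layout seen in the excerpts above; the names `parse_line` and `recovery_gap` are made up for this example and are not part of FlexRAID.

```python
import re
from datetime import datetime

# Lines copied from the log excerpts in this post.
SAMPLE = [
    "[2012-01-17 20:10:49,177] ERROR: Invalid entry detected...",
    "[2012-01-17 20:10:49,177] ERROR: Please run the reconcile operation",
    "[2012-01-17 20:16:45,024] INFO : Verifying real-time RAID previous exit state...",
    "[2012-01-17 20:16:45,368] INFO : Mounting Storage Pool drive...",
]

# Assumed layout: [YYYY-MM-DD HH:MM:SS,mmm] LEVEL: message
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})\]\s+(\w+)\s*:\s*(.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    stamp = stamp.replace(microsecond=int(m.group(2)) * 1000)
    return stamp, m.group(3), m.group(4)


def recovery_gap(lines):
    """Time between the first ERROR entry and the first pool-mount entry after it."""
    first_error = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        stamp, level, message = parsed
        if level == "ERROR" and first_error is None:
            first_error = stamp
        elif first_error is not None and message.startswith("Mounting Storage Pool"):
            return stamp - first_error
    return None  # no error seen, or the pool never remounted
```

Calling `recovery_gap(SAMPLE)` on the lines above gives a gap of about 5 minutes 56 seconds, so "six" was a fair guess.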

Normally, anytime I get a BSOD on any machine, I immediately run a chkdsk.  So here’s my question.  Brahim, do the new and great recovery features (that’s not sarcasm, I mean it and I’m very impressed) take care of any file errors caused by the BSOD?  Or should I still stop the pool and stop the FlexRAID service and run chkdsk on the drives?  You said to ask before I do anything so here I am, humbly asking.

Config details if you’re interested.
This is primarily used as Windows Media Center for recording TV with 3 tuners.  I watch TV on the computer's own HD monitor and also via 2 xbox extenders.  The drive media center records to is dedicated wholly to that and is NOT part of the RAID or Pool.

The drives assigned to the RTR pool contain static data like pics, music, docs, and previously recorded TV video I want to keep.

Win7 32bit Premium
2GB RAM,
3GHz C2D Intel Proc.

There are 8 physical Disks, 6 are part of the RTR Pool.  The first drive is partitioned into 2.  One partition contains the OS and one partition is in the pool.

Disk   Partition   Use
0      1           OS
0      2           DRU 1
3      1           DRU 1
4      1           DRU 2
7      1           DRU 3
5      1           DRU 3
6      1           PPU 1
2      1           Recorded TV
1      1           Backups
« Last Edit: January 18, 2012, 08:17:04 am by bigbob »

Offline bigbob

3rd BSOD 2 hrs later while trying to stream out of pool with WMC to extender Xbox.  No bueno. No auto recovery this time.  Reconcile required.  Anyone having issues in beta 13?

Offline naeonline

  • Jr. Member
  • **
  • Posts: 99
  • Karma: +5/-0
You need to enable TRACE in your logging (See Wiki) and upload full logs using something like dropbox (your full logs will likely be too large to upload here).

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,547
  • Karma: +204/-16
@bigbob

1. Open up a bug report with the mini-dumps.
2. Yes, the auto-self correction feature has not been included yet (I am currently collecting data from usage reports).


Offline bigbob

Brahim, thanks for the reply. I'll work on those dump files.

Is there a way to run chkdsk without deleting the pool or removing the drives from the pool?  The pool was already dead of course and not running.  I stopped FR and then tried the following with no success:
In Win7, properties/tools/check for errors: nothing happens (I verified it still works on drives not in the pool).
I rebooted, hit F8 into Safe Mode with command prompt, and the pool drives don't show up at all. I can't get into them in any way.
I rebooted and used the Win7 install disk to get into repair mode (Win7 wanted to do some sort of auto correction so I canceled out of that), went to the next step and found nothing useful.
I booted with pmagic (Linux-based utility disk). Half of the drives in the pool show as unallocated in the test disk app, but I can still see the files in the file browser like normal ??? (I'm not a Linux expert so this might just be me.)

Anyhow, I'm guessing you've thought through this but whatever you do to lock the drives down for the pool is very successful at locking them down.  I assumed it was a software trick but for me it seems to be hardware based in some way since I can't get into the drives in any of my normal ways. 

I don't want to try this right now, but I'm assuming there is no option to defrag the disks either while in the pool?  Or would running defrag on the virtual pool drive work?

Thanks for your effort.

Offline Brahim

1. You are going about it the hard way.

2. You can run chkdsk, scan, and others only on the source drives and never on the pool.
The program you use should be able to see the source disks as folder mounted drives.

3. Nothing is hardware locked. You can pull any of your drives and plug it in another computer at any time.

4. You can revert to the old driver if it proves to be more stable for you.
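
As a rough illustration of point 2, the Python sketch below runs chkdsk against each source disk through its folder mount point. The mount paths are hypothetical placeholders (the thread does not say where the source disks are mounted), and the helper names are made up for this example. Run without `/f`, chkdsk only reports problems and changes nothing, so that is the default here.

```python
import subprocess

# Hypothetical folder mount points for the pool's source disks; substitute
# whatever paths the source disks are actually mounted under.
SOURCE_MOUNTS = [r"C:\PoolSources\DRU1", r"C:\PoolSources\DRU2"]


def chkdsk_command(mount_point, fix=False):
    """Build a chkdsk command line for one volume or mount point.

    Without /f chkdsk runs read-only; with /f it attempts repairs.
    """
    cmd = ["chkdsk", mount_point]
    if fix:
        cmd.append("/f")
    return cmd


def check_sources(mounts, fix=False):
    """Run chkdsk on each source mount point and return {mount: exit code}."""
    results = {}
    for mount in mounts:
        completed = subprocess.run(chkdsk_command(mount, fix))
        results[mount] = completed.returncode
    return results
```

On Windows, from an elevated prompt and with the pool stopped, `check_sources(SOURCE_MOUNTS)` would check each source disk in turn. Never point it at the pool drive itself.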


Offline bigbob

I hate to be dense.  For me, since the pool is down, the drives only show up in the Computer Management/disk management program.  (rt clk my computer, choose manage, then disk mgmt).  While in there, I can right click on each drive, then choose properties, then tools, and finally error checking.  If you initiate the error check process on any drive NOT in the pool it functions normally and runs the chkdsk program.  If I click on "check now" on any disk in the pool, nothing happens.

Are you not seeing the same behavior with the built in MS tools?  If you are, what program do you recommend for use with drives in the pool?

Can you make a recommendation for a defragger as well?  I normally use Auslogics Disk Defrag and it doesn't see the pool drives either.  I used it on all the drives before I created the pool just fine. 

Thanks again.  I hope my challenges are helping you develop a better product and not just wasting your time.