
Topics - tonym

Pages: [1]
1
Snapshot RAID / Quick-validate error: ZipException
« on: December 12, 2017, 09:16:30 am »
All of a sudden, FlexRAID has stopped working for me; I'm getting the following error message:

[2017-12-12 14:58:30,736] INFO : Executing quick-validate...
[2017-12-12 14:58:30,736] INFO : [quick-validate] started at: Tue Dec 12 14:58:30 GMT 2017
[2017-12-12 14:58:30,774] ERROR: [quick-validate] error: ZipException[Not in GZIP format]
java.util.zip.ZipException: Not in GZIP format
   at java.util.zip.GZIPInputStream.readHeader(Unknown Source)
   at java.util.zip.GZIPInputStream.<unknown>(Unknown Source)
   at java.util.zip.GZIPInputStream.<unknown>(Unknown Source)
   at com.tchegbe.lib.b.a.a.a.<unknown>(Unknown Source)
   at com.tchegbe.lib.b.a.a.a.<unknown>(Unknown Source)
   at com.tchegbe.lib.b.a.a.a.a(Unknown Source)
   at com.tchegbe.lib.io.a.g.c.<unknown>(Unknown Source)
   at com.tchegbe.lib.io.a.g.c.a(Unknown Source)
   at com.tchegbe.lib.io.a.g.c.<unknown>(Unknown Source)
   at com.tchegbe.lib.io.a.g.c.<unknown>(Unknown Source)
   at com.tchegbe.lib.io.a.e.x.a(Unknown Source)
   at com.tchegbe.lib.io.a.e.c.a.run(Unknown Source)
   at java.lang.Thread.run(Unknown Source)
[2017-12-12 14:58:30,775] INFO : [quick-validate] completed at: Tue Dec 12 14:58:30 GMT 2017

I've deleted and re-installed both FlexRAID and my configuration; neither rectified the problem. It feels like my parity and/or metadata has become corrupt.
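
The "Not in GZIP format" message comes from Java's GZIPInputStream when a stream does not begin with the two gzip magic bytes (0x1f 0x8b). A minimal Python sketch for finding such files in a folder follows. It assumes, without confirmation from the log, that the file FlexRAID failed to read lives in a folder you can point the script at; the function names are illustrative only.

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def find_non_gzip(folder):
    """Return the files under `folder` that do NOT start with the gzip magic.

    Which FlexRAID files are expected to be gzip-compressed is an
    assumption here; the stack trace only shows that one stream was not.
    """
    return sorted(
        p for p in Path(folder).rglob("*") if p.is_file() and not is_gzip(p)
    )
```

Calling find_non_gzip() on a copy of the suspect folder lists candidates for the damaged file. It only inspects headers, so a file that passes could still be corrupt further in.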

Any suggestions, please?


2
Snapshot RAID / Persistent Validate and Verify Errors; a RAID-F 2.1 bug
« on: February 14, 2016, 11:04:09 am »
For over two months now, I have been plagued by Validate and Verify errors, which coincidentally began around the time I upgraded to FlexRAID-2.1-Final-2015-11-01. I've been using FlexRAID since before it became commercial and never had these problems before.

I'm running RAID-F and Storage Pooling in Expert mode on a dedicated (underline dedicated) Windows 7 Home x64 based system. I have turned off my AV software and Windows Update, so there should be nothing left to interfere with my RAID disks. I am 100% confident that my disks are all good (S.M.A.R.T. tests etc.), and that none of my files are corrupt; I can check any that FlexRAID flags as corrupt using my own independent PAR2 files.

After more Re-Creates than I care to recall, and far too many days of controlled experiments, I can now almost predict when I will get a Validate failure.

I run Re-Create (takes 21 hours), followed immediately by full Verify (another 22 hours). Everything is fine.

Add a new sub-directory e.g. L:\Data1\Newdir1\, say a dozen files total around 4GB, all to the same physical disk
Run Update followed by Validate, and everything is fine.

Add a new sub-directory, but this time to a different disk e.g. M:\Data2\Newdir2, again say half a dozen files around 4GB. Note this is to a different physical disk.
Run Update followed by Validate, and then FlexRAID flags one or two of the files in L:\Data1\Newdir1\ as corrupt, i.e. files added before the previous Update, and which had already checked out OK by the previous Validate. But they're not corrupt; they check out using my PAR2 files.
(I have a suspicion that this second batch of files doesn't need to be to a different disk, just different sub-directory e.g. L:\Data1\Newdir2; but I'd need 48 hours elapsed to re-test and prove this.)
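
As a second independent check alongside PAR2, the steps above can be bracketed with a checksum manifest: record hashes after the first Update, then compare after the second. This is a minimal Python sketch and not part of FlexRAID; the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to `folder`) to its SHA-256 digest."""
    root = Path(folder)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(folder, manifest):
    """Return names from `manifest` whose content now differs or is missing."""
    current = build_manifest(folder)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

If changed_files() returns an empty list for L:\Data1\Newdir1\ after the second Update while Validate still flags files there, the file contents on disk have not changed between the two runs.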

I note that a few others have created posts for unexplained Validate/Verify errors. To me, this feels like a bug in FlexRAID 2.1. I have hit the buffers; I can't think of anything else I can do to resolve this problem. And besides, I have spent far too much time already trying to pin it down, not just hours, but days.

So, and this is a question aimed at Brahim mainly, what do I do? Raise a bug report? Go back to 2.0?

Thanks in anticipation...

3
General Discussion / Help please with a simple Expression Script
« on: December 30, 2015, 01:10:00 pm »
I'm trying to make use of Expression Scripts, but I seem to be misunderstanding something fundamental. For test purposes, I have created a very simple script, as follows:

@config Set01
@queued true
@execute @task quick-validate

I then use the Scheduler to run it; but it fails with the error message:

[2015-12-30 18:55:00,362] ERROR: Duplicate entry error for RAIDConfigModel
java.sql.SQLException: Duplicate entry error for RAIDConfigModel

Nothing's duplicated as far as I can see. Any suggestions please?

4
Snapshot RAID / Verify fail; but why, which drive and file?
« on: November 24, 2015, 05:13:50 am »
Any comments/suggestions on what has gone wrong and how to identify the corrupt drive and file would be appreciated...

I've just run Verify, and it failed part way through with a long list of error messages, almost all being "ERROR: INFO: healthy file, filename". But I assume the more significant error messages are the "ERROR: At byte" or possibly the "ERROR: NOTE:" messages. Simplifying my log, here goes:

 ERROR: Verify failure: bit mismatch!
 ERROR: At byte: 3262520564367...
 ERROR: NOTE: D:2:0\Video_01\TV\filename1 closed before the failure at 393794396405
 ERROR: INFO: healthy file => D:2:0\Video_01\TV\filename1
   a long list of error info messages all detailing apparently healthy files
 ERROR: NOTE: D:3:0\Video_01\filename2 closed before the failure at 1685898914682
 ERROR: INFO: healthy file => D:3:0\Video_01\TV\filename2
 ERROR: NOTE: D:0:0\Video_01\filename3 was open at the time of the failure
 ERROR: INFO: healthy file => D:0:0\Video_01\TV\filename3
 ERROR: NOTE: filename4 was open at the time of the failure
 ERROR: INFO: healthy file => D:1:0\Video_01\TV\filename4
 ERROR: NOTE: P:0:0\003039.flxr was open at the time of the failure
 ERROR: INFO: healthy file => P:0:0\003039.flxr
 ERROR: Buffer size: 1048576 - Number of failed bytes in buffer: 1230
 ERROR: First failure at position: 3262520564367 - Last failure at position: 3262520565597
 INFO : Verify failure: operation aborted!

How do I tell what file(s) is at location 3262520564367 to 3262520565597, as advised by the ERROR: At byte message? It's not even evident which of my four DRUs it applies to.
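
A small Python sketch for pulling the numbers out of the two summary lines of the log above and cross-checking them. It does not map the offset to a file or drive, since that depends on how FlexRAID lays out its parity, which the log does not reveal. The function name and dictionary keys are illustrative.

```python
import re

# The two summary lines, copied from the Verify log above.
LOG_EXCERPT = """\
ERROR: Buffer size: 1048576 - Number of failed bytes in buffer: 1230
ERROR: First failure at position: 3262520564367 - Last failure at position: 3262520565597
"""


def failure_summary(log_text):
    """Extract the failure positions and failed-byte count from the log."""
    positions = re.search(
        r"First failure at position: (\d+) - Last failure at position: (\d+)",
        log_text,
    )
    first, last = (int(value) for value in positions.groups())
    failed = int(
        re.search(r"Number of failed bytes in buffer: (\d+)", log_text).group(1)
    )
    return {
        "first": first,
        "last": last,
        "span": last - first,          # distance between first and last failure
        "failed_bytes": failed,        # count reported by FlexRAID
        "first_tb": first / 1e12,      # offset in decimal terabytes
    }


summary = failure_summary(LOG_EXCERPT)
print(summary)
```

Here the span between the first and last failure is 1230 bytes, the same figure the log reports as the number of failed bytes in the buffer, so the mismatch appears confined to one small region rather than scattered across the 1 MB buffer.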

I don't believe in co-incidences, but this is my first Verify since upgrading to RAID-F ver 2.1 2015-11-01.

Finally, no problems according to SMART monitoring.

Appreciate any help...

5
I've been using FlexRAID for over three years; earlier this week, I experienced a disk failure and I'm delighted to be able to say that I was able to effect a full recovery. Not a single byte was lost. A big thank you to Brahim.
(I've had disk failures before; but this was the first where I was unable to copy the data off the failing drive and where I needed to instigate a RAID recovery.)

My configuration: 4x 3TB DRU and 1x 3TB PPU. I run RAID-F in Expert mode and have my array configured as a Storage Pool.

I felt that my experience may be of interest to others here on the forum.

My server had been operating fine for a few hours when suddenly file access to my storage pool hung. Initially, I thought this was a network problem. But I was soon able to establish, by logging into my server and using FlexRAID's Drive Testing and S.M.A.R.T. menus, that one of my DRUs had failed. My last update was the night before; since then only a couple of gigabytes had been created, a mere kilobyte or two deleted, and some files renamed. So I was optimistic for a decent recovery.

I rushed out and bought a new 3TB disk, installed it in place of the failed unit, then initialized and formatted it.

I then ran FlexRAID Restore. It took a nail biting 10 hours to complete.

My next step was to examine the logs. I had a number of information messages (INFO) about displaced files (because of the renames I'd performed) and two warnings (WARN messages). The first WARN was for the deleted file; the other WARN was to say that one restored file was corrupt. This would have been because the deleted file was needed as part of the parity reconstruct for the restored corrupt file.

Now because I've long had a fear of bit rot, I always create a 1% PAR2 file for each sub-directory. So I was able to run QuickPar and repair the corrupt restored file.

I decided to perform a Quick Validate. This produced a number of WARN messages for deleted and new files - almost all paired because of the renames. But there were four of more significance. One WARN was for the corrupt restored file which had been subsequently repaired - its Date Modified had changed, so this was good. The other three WARNs were for three files for which, for some reason, FlexRAID had not set the Date Modified; these files all checked out when I ran QuickPar.

Next, I ran Update to bring my array up-to-date. This took only a minute or so, and completed successfully.

I noticed that FlexRAID restore only set the Date Modified field for restored files (well 99.99% of the time), but NOT for the restored directory names.

Closing thoughts/tips...

1. Creating PAR2 files added an extra level of protection for my data. It helped ensure that nothing was lost.
Tip #1. Consider creating PAR2 files.

2. It's a shame that RAID-F doesn't set the Date Modified for restored directories. But I was able to overcome this limitation using a shareware program called FolderTimeUpdate, which set the folder's Date Modified to the time of the latest file within.

3. And what about that 2GB file that was copied to my server just before it failed? No problem, the original was still sitting on my desktop. I simply copied it over again.
Tip #2. Keep originals until after you run Update.
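
For closing thought 2, the behaviour described (setting each folder's Date Modified to the time of the latest file within) can be sketched in a few lines of Python. This is not FolderTimeUpdate's actual implementation, just an illustration of the same idea; the function names are made up, and "within" is assumed here to include files in sub-folders.

```python
import os


def newest_file_mtime(folder):
    """Return the latest modification time of any file under `folder`.

    Returns None when the folder (including sub-folders) holds no files.
    """
    newest = None
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            mtime = os.stat(os.path.join(dirpath, name)).st_mtime
            if newest is None or mtime > newest:
                newest = mtime
    return newest


def sync_folder_times(root):
    """Set every folder's Date Modified to that of its newest contained file.

    Folders are visited deepest first; folders without files are skipped.
    Returns the list of folders that were updated.
    """
    updated = []
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        newest = newest_file_mtime(dirpath)
        if newest is not None:
            atime = os.stat(dirpath).st_atime  # keep the access time as-is
            os.utime(dirpath, (atime, newest))
            updated.append(dirpath)
    return updated
```

Run it against the restored drive's top-level data folder after the restore has finished; try it on a small test folder first.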

6
Snapshot RAID / Where do I find FlexRAID Disk Manager
« on: December 10, 2013, 11:54:08 am »
I've decided to replace a non-failed 2TB drive (DRU) with a new 3TB unit, by following the instructions in the wiki post "Replacing/Swapping out a non-failed drive for another drive" (http://wiki.flexraid.com/2013/02/13/replacing-swapping-out-a-non-failed-drive-for-another-drive/)
Step 4 states: Go to the FlexRAID Disk Manager and do a swap-only operation. But I can't find the FlexRAID Disk Manager anywhere on FlexRAID's web client...!? I was expecting to find it under System Control Panel, System Toolbox. But no luck there.

Can someone point me in the right direction, please? Thanks

And just in case this is relevant, my Snapshot RAID-F configuration was created using Expert mode.

7
For information, bit of a confusing gotcha this...

Just upgraded my OS from WinXP to Win7 and so had to use the consumer portal to transfer one of my licenses even though my hardware hadn't changed. As per the wiki, I generated a new pre-registration file and cut and pasted the contents into the transfer license option, which then appeared to error with the dialogue box "Operation failure" and an OK button.

Now the phrase Operation Failure led me to believe that the creation of a new "FlexRAID.key.txt" file had failed. So I clicked OK, created another pre-registration file and tried again. And again. And yet again; four times in all. In fact each had worked; when I checked my email I had four new "FlexRAID.key.txt" files!!! Bit of a time waster. How else is one supposed to interpret "Operation Failure" and OK???

Just in case this catches out someone else, I thought I'd share my experience.

 :-\

8
Storage Pooling / Storage Pool Server Shares not working (for me...!)
« on: August 06, 2012, 06:15:21 am »
Now I must be doing something stupid; I'm missing something simple... but I just can't see what.

I have FlexRAID 2.0 Final Update 9 installed on my Windows XP test server. Using cruise control, I have defined a snapshot RAID with storage pooling: 2x DRU + 1x PPU. All is working except that when I use FlexRAID's client to set up a share for the pooled drive, I cannot see it across my network. I have followed the wiki precisely, http://wiki.flexraid.com/2011/12/03/accessing-your-storage-pool-over-a-network/

Note that I can share and access the pooled drive if I set it manually using WinXP Explorer (right-click pooled drive letter etc.). So it's not a network problem (I presume). Setting the share via WinXP is unacceptable as it is lost on reboot, which is why FlexRAID has a client option to set up server shares.

Things that I have tried include:
- Under Preferences and Settings, Run-Time Properties, Advanced: change Removable to false
- Under Preferences and Settings, Storage Pool Merge Configuration: using a different drive letter
- Restarting FlexRAID
- Deleting my config and starting again.

The only thing that I can see that looks odd is under Drive Manager. There is an entry for Drive State: Unsupported [Other] shares@network[] MISSING. Is this relevant?

I have attached a couple of screenshots of my configuration.

Any help/advice appreciated. thx

9
Snapshot RAID / Exclusions not working properly
« on: March 19, 2012, 08:48:12 am »
Hi,

My Snapshot RAID (T1+ 3xDRU & 1xPPU, running on WinXP) has started to throw up problems resulting from an (apparent) failure to exclude files in the Exclusion list. The problem started about two weeks ago while I was still on v2.0 beta 14, but it is still happening now with v2.0 Final. My Exclusions are as per the default; I've never changed them, viz:

*\RECYCLED;*\RECYCLER\S-*;*\$Recycle.bin;*\System Volume Information

The problem manifests itself in at least two ways. For example, a Quick-Validate after an Update issues the following warnings; example from my log file:
[2012-03-19 06:21:12,484] INFO : Executing quick-validate...
[2012-03-19 06:21:12,484] INFO : [quick-validate] started at: Mon Mar 19 06:21:12 GMT 2012
[2012-03-19 06:21:16,640] INFO : Using RAID engine T1+
[2012-03-19 06:21:19,078] WARN : Changed: M:\System Volume Information\EfaData\SYMEFA.DB
[2012-03-19 06:21:30,500] WARN : Changed: M:\RECYCLER\S-1-5-21-1801674531-1336601894-1644491937-1003\desktop.ini
[2012-03-19 06:21:30,953] WARN : Changed: M:\RECYCLER\S-1-5-21-1801674531-1336601894-1644491937-1003\INFO2
[2012-03-19 06:21:53,656] WARN : Changed: N:\System Volume Information\EfaData\SYMEFA.DB
[2012-03-19 06:21:53,890] WARN : Changed: N:\RECYCLER\S-1-5-21-1801674531-1336601894-1644491937-1003\desktop.ini
[2012-03-19 06:21:53,937] WARN : Changed: N:\RECYCLER\S-1-5-21-1801674531-1336601894-1644491937-1003\INFO2
[2012-03-19 06:22:00,359] WARN : Changed: L:\System Volume Information\EfaData\SYMEFA.DB
[2012-03-19 06:22:00,515] WARN : Changed: L:\RECYCLER\S-1-5-21-1801674531-1336601894-1644491937-1003\desktop.ini
[2012-03-19 06:22:00,531] WARN : Changed: L:\RECYCLER\S-1-5-21-1801674531-1336601894-1644491937-1003\INFO2
[2012-03-19 06:22:00,562] INFO : [quick-validate] completed at: Mon Mar 19 06:22:00 GMT 2012

Another manifestation of the problem is with Verify. The Verify aborts around 15% with the message "ERROR: Verify failure: bit mismatch!" and warning messages indicating modified files that should have been excluded, usually \System Volume Information\EfaData\SYMEFA.DB

With the help of Google, I've established that SYMEFA.DB belongs to Symantec Endpoint Protection. So it is quite conceivable that this file is constantly changing as I run with real-time anti-virus protection. But surely, this should be irrelevant as FlexRAID should be ignoring files that match an Exclusion Pattern.
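
One way to reason about the exclusion list is to test the patterns against the logged paths with ordinary glob matching. The Python sketch below does that under two assumed interpretations: the pattern must match the whole file path, or it may match any parent folder. How FlexRAID actually evaluates its Exclusion Patterns is not known from the log, so neither interpretation should be read as FlexRAID's behaviour; the function names are illustrative.

```python
from fnmatch import fnmatchcase
from pathlib import PureWindowsPath

# Default exclusion list, copied from the post above and split on ';'.
EXCLUSIONS = (
    r"*\RECYCLED;*\RECYCLER\S-*;*\$Recycle.bin;*\System Volume Information"
).split(";")


def matches_whole_path(path, patterns=EXCLUSIONS):
    """Interpretation A (assumed): a pattern must match the entire path."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def matches_path_or_ancestor(path, patterns=EXCLUSIONS):
    """Interpretation B (assumed): a pattern may match any parent folder."""
    candidates = [path] + [str(p) for p in PureWindowsPath(path).parents]
    return any(matches_whole_path(c, patterns) for c in candidates)
```

Under interpretation A, M:\System Volume Information\EfaData\SYMEFA.DB is not matched, because the pattern ends at the folder name; under interpretation B it is. The RECYCLER files match under both, so plain glob semantics alone would not explain why they still appear in the warnings.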

Any ideas/suggestions/advice? I don't want to turn off my AV.

Thanks

10
Hi,

I've just upgraded from FlexRAID-2.0-PreviewXII to the new (ink still wet!) FlexRAID-2.0-Public-Beta1.

And I've successfully built and validated my previous T1+ 3xDRU + 1xPPU configuration.

But I can no longer run the web client using Google Chrome; I'm having to use Microsoft IE instead. It looks as though the web client now uses ActiveX, which I know is not supported by Chrome. Is my diagnosis correct?

Chrome displays the following error message:
com.google.gwt.user.client.rpc.StatusCodeException
500 The call failed on the server; see server log for details

And my logs show:
[2011-11-19 13:44:31,468] INFO : Started SocketConnector@0.0.0.0:8080
[2011-11-19 13:45:05,812] INFO : com.tchegbe.flexraid.main.a.w-53057384: ERROR: The serialization policy file '/com.tchegbe.flexraid.UI/D0F160EF10051FFA0B7A2BD194222763.gwt.rpc' was not found; did you forget to include it in this deployment?
[2011-11-19 13:45:05,812] INFO : com.tchegbe.flexraid.main.a.w-53057384: WARNING: Failed to get the SerializationPolicy 'D0F160EF10051FFA0B7A2BD194222763' for module 'http://localhost:8080/com.tchegbe.flexraid.UI/'; a legacy, 1.3.3 compatible, serialization policy will be used.  You may experience SerializationExceptions as a result.
[2011-11-19 13:45:05,906] WARN : Exception while dispatching incoming RPC call
com.google.gwt.user.client.rpc.SerializationException: Type 'com.tchegbe.flexraid.client.model.GlobalInfoModel' was not assignable to 'com.google.gwt.user.client.rpc.IsSerializable' and did not have a custom field serializer.For security purposes, this type will not be serialized.: instance = com.tchegbe.flexraid.client.model.GlobalInfoModel@320a2b8
   at java.lang.Void.<unknown>(Unknown Source)
   at java.lang.Void.<unknown>(Unknown Source)
   at java.lang.Void.<unknown>(Unknown Source)
   at java.lang.Void.<unknown>(Unknown Source)

The above is all gibberish to me; and I can see the word java, but java is enabled in my Chrome, so I don't think this is the problem.

Any suggestions much appreciated.

And a big thank you to Brahim, NLS, etc for all the work they've put in.

Tony
