Author Topic: Resume failed create works but interface doesn't update  (Read 1601 times)

Offline Mondo

  • Jr. Member
Resume failed create works but interface doesn't update
« on: October 24, 2011, 09:45:32 pm »
First, my setup: a Norco 4224 case with a Corsair TX650 PSU, a Supermicro X9SCM-F motherboard, 2 x Supermicro AOC-SASLP-MV8 cards, 4 GiB of RAM, a Core i3 processor, and a bunch of 2 TB hard drives - a mixture of Seagate and WD Green drives.

I'm using Snapshot RAID with the TX Infinity engine. I recently reconfigured everything and started from scratch so that I could mount the drives according to their slots in the Norco, making them easier to identify when a drive fails: the SASLP-MV8 cards have no activity LED interface, and the Norco's backplane is fairly basic when it comes to driving LEDs. It's a budget server, and while it wasn't simple to work out which drive belonged to which port and which port belonged to which slot, after a little trial and error I now have a handle on it, so each new drive I add will go into a known slot and controller port.
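(For anyone trying the same drive-to-bay mapping, the sketch below is one generic way to script it on Windows: list each physical disk's index, model and serial number with the stock wmic tool, then match the serial printed on each drive's label to the bay it went into. It's only an illustration, nothing FlexRAID-specific.)

Code:
# List each physical disk's index, model and serial number so the serial printed
# on a drive's label can be matched to the bay/controller port it was put in.
# Windows only; uses the stock "wmic" command, nothing FlexRAID-specific.
import subprocess


def list_physical_disks():
    out = subprocess.check_output(
        ["wmic", "diskdrive", "get", "Index,Model,SerialNumber", "/format:csv"],
        text=True)
    disks = []
    for line in out.splitlines():
        parts = [p.strip() for p in line.strip().split(",")]
        # Data rows look like: Node,Index,Model,SerialNumber (header/blank lines are skipped)
        if len(parts) == 4 and parts[1].isdigit():
            disks.append({"index": int(parts[1]),
                          "model": parts[2],
                          "serial": parts[3]})
    return sorted(disks, key=lambda d: d["index"])


if __name__ == "__main__":
    for d in list_physical_disks():
        print(f"PhysicalDrive{d['index']}: {d['model']}  S/N {d['serial']}")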

I'm running WHS 2011 and everything worked fine the first time around, but this time, starting from scratch, I threw in an extra parity drive and a couple of DRUs, so I now have 7 DRUs and 2 PPUs, all 2 TB in size. About 60% of the way through the initial parity build I got an unexpected EOF error and memory leaks. So I ran "resume failed create" and that worked perfectly, BUT on completion the FlexRAID interface didn't recognise that it had finished, so the interface has not updated.
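(For anyone unfamiliar with the terms: the parity build is the step where the engine reads every DRU and writes the computed parity to the PPUs. As a very rough single-parity illustration only - the real TX Infinity engine does dual parity and is far more involved - the basic idea is something like this:)

Code:
# Toy illustration of a snapshot parity pass: XOR the block at the same offset
# across all data units (DRUs) and write the result to a parity unit (PPU).
# The real TX Infinity engine does dual parity; this only shows the basic idea.
import functools

BLOCK = 1024 * 1024  # 1 MiB blocks for the example


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together."""
    return bytes(functools.reduce(
        lambda a, b: [x ^ y for x, y in zip(a, b)], blocks))


def build_parity(dru_paths, ppu_path):
    """dru_paths: files standing in for data drives; ppu_path: parity output."""
    sources = [open(p, "rb") for p in dru_paths]
    try:
        with open(ppu_path, "wb") as parity:
            while True:
                blocks = [f.read(BLOCK) for f in sources]
                if not any(blocks):
                    break  # every source is exhausted
                size = max(len(b) for b in blocks)
                # Pad shorter/exhausted sources with zeros so the XOR lines up.
                blocks = [b.ljust(size, b"\x00") for b in blocks]
                parity.write(xor_blocks(blocks))
    finally:
        for f in sources:
            f.close()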

I tried rebooting and running "resume failed create" again, even though the create hadn't failed, and it reported no data change, which seems to indicate that the parity is actually correct.

Is there any way to force the FlexRAID GUI to update, or do I have to re-create the parity data from scratch? I think the reason the first create failed could be that pagefile.sys is not automatically excluded by FlexRAID, or it could be that I forgot to exclude the C: drive from my storage pool initially.
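(In case it helps anyone hitting the same thing, a quick pre-flight check before the next create is to scan each DRU root for the usual always-changing Windows system files. Rough sketch below - this isn't FlexRAID's exclusion feature, and the drive letters are just placeholders for the DRU mounts:)

Code:
# Quick pre-flight check: flag Windows system files on any DRU that change
# constantly and probably shouldn't be covered by snapshot parity. This is not
# FlexRAID's exclusion feature, just a manual sanity check; the DRU drive
# letters below are placeholders.
import os

DRU_ROOTS = ["D:\\", "E:\\", "F:\\"]      # placeholder DRU mount points
PROBLEM_FILES = ["pagefile.sys", "hiberfil.sys", "swapfile.sys"]

for root in DRU_ROOTS:
    for name in PROBLEM_FILES:
        path = os.path.join(root, name)
        if os.path.exists(path):
            print(f"WARNING: {path} exists - exclude it before building parity")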

Here's the log of the last boot...

[2011-10-25 11:34:34,518] INFO : Starting FlexRAID...
[2011-10-25 11:34:34,518] INFO : FlexRAID 2.0 Preview XII [Snapshot 1.4 stable / Real-Time 1.0 alpha / Storage Pool 1.0 beta]
[2011-10-25 11:34:34,534] INFO : Started: 25 October 2011 11:34:34 AM
[2011-10-25 11:34:37,202] INFO : Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
[2011-10-25 11:34:37,202] INFO : Quartz Scheduler v.1.8.4 created.
[2011-10-25 11:34:37,202] INFO : Registering Quartz shutdown hook.
[2011-10-25 11:34:37,202] INFO : Using thread monitor-based data access locking (synchronization).
[2011-10-25 11:34:38,153] INFO : Removed 0 Volatile Trigger(s).
[2011-10-25 11:34:38,153] INFO : Removed 0 Volatile Job(s).
[2011-10-25 11:34:38,153] INFO : JobStoreTX initialized.
[2011-10-25 11:34:38,153] INFO : Scheduler meta-data: Quartz Scheduler (v1.8.4) 'flexraid' with instanceId '1'
  Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
  NOT STARTED.
  Currently in standby mode.
  Number of jobs executed: 0
  Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 3 threads.
  Using job-store 'org.quartz.impl.jdbcjobstore.JobStoreTX' - which supports persistence. and is not clustered.

[2011-10-25 11:34:38,153] INFO : Quartz scheduler 'flexraid' initialized from default resource file in Quartz package: 'quartz.properties'
[2011-10-25 11:34:38,153] INFO : Quartz scheduler version: 1.8.4
[2011-10-25 11:34:38,262] INFO : Freed 0 triggers from 'acquired' / 'blocked' state.
[2011-10-25 11:34:38,278] INFO : Recovering 0 jobs that were in-progress at the time of the last shut-down.
[2011-10-25 11:34:38,278] INFO : Recovery complete.
[2011-10-25 11:34:38,278] INFO : Removed 0 'complete' triggers.
[2011-10-25 11:34:38,278] INFO : Removed 0 stale fired job entries.
[2011-10-25 11:34:38,325] INFO : Scheduler flexraid_$_1 started.
[2011-10-25 11:34:38,524] INFO : jetty-7.x.y-SNAPSHOT
[2011-10-25 11:34:41,879] INFO : Started SocketConnector@0.0.0.0:8080
[2011-10-25 11:35:12,176] INFO : Initiating FlexRAID Storage Pool...
[2011-10-25 11:35:12,195] INFO : Host OS = Microsoft Windows [Version 6.1.7601]
[2011-10-25 11:35:12,195] INFO : Building View Registry...
[2011-10-25 11:35:12,196] INFO : Building Auto View Registry...
[2011-10-25 11:35:12,201] INFO : Loading View options...
[2011-10-25 11:35:12,205] INFO : Using Auto-View definition...
[2011-10-25 11:35:12,205] INFO : Building View Handler...
[2011-10-25 11:35:12,205] INFO : Mounting Storage Pool drive...
[2011-10-25 11:35:29,834] INFO : Reconnecting FlexRAID=Z:\ /GRANT:MONDOWHS\Administrator,FULL /GRANT:Everyone,READ
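(One sanity check worth noting: the SocketConnector line above shows the web UI backend listening on port 8080, so a plain HTTP probe like the sketch below at least confirms the backend is still answering. It's not a FlexRAID API call, just a request to that port:)

Code:
# Minimal liveness probe for the web UI that the boot log shows listening on
# port 8080 (the SocketConnector line). Only confirms the HTTP server answers;
# it is not a FlexRAID API call.
import urllib.error
import urllib.request

URL = "http://localhost:8080/"   # port taken from the boot log above

try:
    with urllib.request.urlopen(URL, timeout=5) as resp:
        print(f"Web UI answered with HTTP {resp.status}")
except (urllib.error.URLError, OSError) as exc:
    print(f"Web UI did not answer: {exc}")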

And here's the log where the initial parity build failed  :(...
[2011-10-21 18:35:37,256] INFO : *** Starting workers... ***
[2011-10-22 00:00:00,431] INFO : Trigger update.FlexRAID[update][Day] fired job FlexRAID.FlexRAID[update] at:  00:00:00 10/22/2011
[2011-10-22 00:00:00,790] INFO : Trigger update.FlexRAID[update][Day] completed firing job FlexRAID.FlexRAID[update] at  00:00:00 10/22/2011 with resulting trigger instruction code: DO NOTHING
[2011-10-22 10:09:38,794] ERROR: Invalid state - Unexpected EOF!
java.io.IOException: Invalid state - Unexpected EOF!
   at java.lang.Void.<unknown>(Unknown Source)
   at java.lang.Void.<unknown>(Unknown Source)
   at java.lang.Void.<unknown>(Unknown Source)
   at java.lang.Void.<unknown>(Unknown Source)
[2011-10-22 10:09:38,809] ERROR: Invalid state - reading=false removing=false writing=false needsWriting=false computing=false needsComputing=true
java.io.IOException: Invalid state - reading=false removing=false writing=false needsWriting=false computing=false needsComputing=true
   at java.lang.Void.<unknown>(Unknown Source)
   at java.lang.Void.<unknown>(Unknown Source)
   at java.lang.Void.<unknown>(Unknown Source)
   at java.lang.Void.<unknown>(Unknown Source)
[2011-10-22 10:09:38,950] WARN : Operation is being aborted... re-validating last processed entries...
[2011-10-22 10:09:53,286] INFO : [create] completed at: Sat Oct 22 10:09:53 EST 2011
[2011-10-22 10:09:53,318] WARN : Memory leak detected on native buffers! Leaks count = 30 out of 95
[2011-10-24 13:30:27,903] INFO : Scheduler flexraid_$_1 shutting down.
[2011-10-24 13:30:27,903] INFO : Scheduler flexraid_$_1 paused.
[2011-10-24 13:30:27,903] INFO : Scheduler flexraid_$_1 shutdown complete.


thanks

John




Norco RPC-4224 4U Rackmount Case (24 Hotswap Bays) | Supermicro X9SCM-F Motherboard (four PCI-E 8x slots @ 8x/8x/4x/4x) | 4 GiB RAM | Core i3-2100 | 2 x Supermicro AOC-SASLP-MV8 SAS Controller | Corsair TX650v2 PSU | Currently 12 TB array, 2 TB parity | WHS 2011

Offline Brahim

  • Global Moderator
  • Hero Member
Re: Resume failed create works but interface doesn't update
« Reply #1 on: October 25, 2011, 11:08:14 am »
I think you will do best starting from scratch.

Offline Mondo

  • Jr. Member
Re: Resume failed create works but interface doesn't update
« Reply #2 on: October 25, 2011, 07:04:19 pm »
Quote from: Brahim
I think you will do best starting from scratch.


OK - I was trying to avoid that, but no problem. I'll just re-format the parity drives and start again. Does this mean that a "resume failed create" is a bad idea...?