Author Topic: How to get Notification alert "Missing Drive" in pool and prevent update tasks?

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,270
  • Karma: +199/-15
SMART is used for disk health monitoring. It will notify you if issues are found with the monitored disks.

Offline TheJSN

  • Newbie
  • *
  • Posts: 13
  • Karma: +0/-0
But which parameter should I monitor to see whether a disk is online or offline? Does SMART Monitoring use the standard notification email set in FlexRAID? I don't see any option for whether a drive is offline or online, only error counts, temperature, etc. The whole point of knowing that a disk is dead or offline is to ensure that the update won't run and the disk can be replaced or recovered immediately. As of right now, I have to manually go to the page and check the disks in the pool to see whether one is missing. Shouldn't we be able to get an alert if a disk has failed?
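
For illustration only, here is a minimal sketch of the kind of external "is every drive still there?" check described above. It is not a FlexRAID feature and does not use any FlexRAID API. It assumes that a small marker file (the name `.pool-drive-marker` is hypothetical) has been created by hand at the root of each data drive, so that an offline drive shows up as a missing marker even if its drive letter or mount folder still exists.

```python
import os

# Hypothetical marker file name; it would have to be created manually on each data drive.
MARKER_NAME = ".pool-drive-marker"


def missing_drives(drive_roots, marker_name=MARKER_NAME):
    """Return the drive roots whose marker file cannot be found.

    A root is reported as missing when the marker file is absent, which
    covers both an unreachable path and an empty mount-point folder.
    """
    return [
        root
        for root in drive_roots
        if not os.path.isfile(os.path.join(root, marker_name))
    ]


def safe_to_run_maintenance(drive_roots, marker_name=MARKER_NAME):
    """True only when every expected drive root still holds its marker file."""
    return not missing_drives(drive_roots, marker_name)
```

A scheduled wrapper script could call `missing_drives([...])` with the real drive roots, send its own alert when the list is non-empty, and only start the maintenance task when `safe_to_run_maintenance` returns `True`.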

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,270
  • Karma: +199/-15
The Update task detects if a disk has failed or not and aborts in such a case.

Read this carefully: http://wiki.flexraid.com/2013/06/27/transparent-raid-web-ui-system-preferences/
When setting up email and SMS, you have the option to choose the types of notifications.

Offline TheJSN

  • Newbie
  • *
  • Posts: 13
  • Karma: +0/-0
I followed a recommendation on this forum to run both a validate and an update as part of the scheduled maintenance of snapshot RAID. Is it possible that it's the validate step that is causing the problem? I sent you a copy of my log file from my last failure, which clearly showed every file on the "missing/offline" disk marked as DELETED, followed by a successful update. Since I didn't discover the disk was offline until I noticed files missing, when I went back to do the recovery it recovered nothing: a blank disk. It seems something in the maintenance schedule just deleted every file from the parity. I assume my disk died sometime before this operation on Jan 19th at 2:00am.

[2017-01-19 02:00:00,481] INFO : [update] started at: Thu Jan 19 02:00:00 EST 2017
[2017-01-19 02:00:26,710] INFO : Using RAID engine Tx
[2017-01-19 02:08:00,113] INFO : Total process size = 381823221760
[2017-01-19 02:08:00,113] INFO : Max data size = 3895049846784
[2017-01-19 02:08:00,113] INFO : DRU1 size=3742662222994
[2017-01-19 02:08:00,113] INFO : DRU2 size=2923890489322
[2017-01-19 02:08:00,113] INFO : DRU3 size=2978383160150
[2017-01-19 02:08:00,113] INFO : DRU4 size=2976806162556
[2017-01-19 02:08:00,113] INFO : DRU5 size=2932063522204
[2017-01-19 02:08:00,113] INFO : DRU6 size=3387872483082
[2017-01-19 02:08:00,113] INFO : DRU7 size=3868260275375
[2017-01-19 02:08:00,113] INFO : DRU8 size=3772404523091
[2017-01-19 02:08:00,113] INFO : DRU9 size=3895048906964
[2017-01-19 02:08:00,113] INFO : DRU10 size=3115721260535
[2017-01-19 02:08:00,113] INFO : DRU11 size=2722723491891
[2017-01-19 03:21:14,290] WARN : D:5:1\youtubedownload\Ming Xi on Instagram “💗💗💗#TrainLikeAnAngel #VSFashionShow.mp4 no longer exist! Skipping...
[2017-01-19 03:24:19,157] ERROR: Invalid state - Unexpected EOF on D:6:0\MP3\2008\Melissa\Nic_Chagall_-_What_You_Need__Marco_V_Remix_.mp3
java.io.IOException: Invalid state - Unexpected EOF on D:6:0\MP3\2008\Melissa\Nic_Chagall_-_What_You_Need__Marco_V_Remix_.mp3
   at com.tchegbe.lib.common.io.d.a(Unknown Source)
   at com.tchegbe.lib.common.io.d.a(Unknown Source)
   at com.tchegbe.lib.common.io.i.<unknown>(Unknown Source)
   at com.tchegbe.lib.common.io.parity.e.a(Unknown Source)
   at com.tchegbe.lib.common.io.e.f.run(Unknown Source)
   at java.lang.Thread.run(Unknown Source)
[2017-01-19 03:24:19,214] WARN : Incomplete - D:6:0\MP3\2008\Melissa\Nic_Chagall_-_What_You_Need__Marco_V_Remix_.mp3 - size:18690732 read:9191304
[2017-01-19 03:24:19,900] INFO : [update] completed at: Thu Jan 19 03:24:19 EST 2017
[2017-01-19 03:24:23,864] INFO : Sending email alert - message=[SNAP-RAID1][UPDATE] FAILURE
[2017-01-19 03:24:32,702] INFO : [SNAP-RAID1][UPDATE] FAILURE - emailed...
[2017-01-19 05:00:00,213] INFO : Executing validate...
[2017-01-19 05:00:00,214] INFO : [validate] started at: Thu Jan 19 05:00:00 EST 2017
[2017-01-19 05:00:22,356] INFO : Using RAID engine Tx
[2017-01-19 05:14:01,039] WARN : Deleted: D:6:0\HDTV1\Drama\Pearl Harbor (2001).mkv

************** It then proceeds to delete every file on the missing disk (thousands of deletions) and finishes many hours later ********

[2017-01-19 19:49:47,221] WARN : Changed: P:0:0\003097.flxr
[2017-01-19 19:49:56,542] WARN : Changed: P:0:0\003098.flxr

************** There are also a number of these "WARN : Changed" parity entries in the log after all of the deletions ***********

[2017-01-20 12:34:29,683] WARN : New file found: C:\FlexRAID-Managed-Pool\class1_0\{beb8cf64-6dca-11e2-93f6-0019d17e6b63}\MP3\Tiger & Kitty Music\WaltzofThrones-v2.mp3
[2017-01-20 12:34:33,825] INFO : [validate] completed at: Fri Jan 20 12:34:33 EST 2017
[2017-01-20 12:34:35,250] INFO : Sending email alert - message=[SNAP-RAID1][VALIDATE] SUCCESS
[2017-01-20 12:34:41,327] INFO : [SNAP-RAID1][VALIDATE] SUCCESS - emailed...
[2017-01-20 23:19:38,337] INFO : Service stopping...
[2017-01-20 23:19:38,337] INFO : Stopping views...
[2017-01-20 23:19:38,891] INFO : Unmounting View...
[2017-01-20 23:19:46,676] INFO : Unmounting View complete...
[2017-01-20 23:19:47,839] INFO : Stopping scheduler...
[2017-01-20 23:19:48,019] INFO : Stopping web server...
[2017-01-20 23:19:48,222] INFO : Service stopping...
[2017-01-20 23:19:48,222] INFO : Stopping views...
[2017-01-20 23:19:48,222] INFO : Stopping scheduler...
[2017-01-20 23:19:48,222] INFO : Stopping web server...
[2017-01-20 23:19:48,353] INFO : Stopping socket server...
[2017-01-20 23:19:48,353] INFO : Stopping socket server...
[2017-01-20 23:19:48,665] INFO : Shutting down at: 1/20/17 11:19 PM
[2017-01-20 23:19:48,665] INFO : Closing all client connections...
[2017-01-20 23:19:48,665] INFO : Shutting down at: 1/20/17 11:19 PM
[2017-01-20 23:19:48,665] INFO : Closing all client connections...
[2017-01-20 23:19:48,681] INFO : Exiting all processes...
[2017-01-20 23:19:48,915] ERROR: Connection reset by peer: socket write error
java.net.SocketException: Connection reset by peer: socket write error
   at java.net.SocketOutputStream.socketWrite0(Native Method)
   at java.net.SocketOutputStream.socketWrite(Unknown Source)
   at java.net.SocketOutputStream.write(Unknown Source)
   at sun.nio.cs.StreamEncoder.writeBytes(Unknown Source)
   at sun.nio.cs.StreamEncoder.implFlushBuffer(Unknown Source)
   at sun.nio.cs.StreamEncoder.implFlush(Unknown Source)
   at sun.nio.cs.StreamEncoder.flush(Unknown Source)
   at java.io.OutputStreamWriter.flush(Unknown Source)
   at java.io.BufferedWriter.<unknown>(Unknown Source)
   at com.tchegbe.lib.b.a.b.a(Unknown Source)
   at com.tchegbe.lib.b.a.c.run(Unknown Source)
[2017-01-20 23:19:49,025] INFO : Exiting all processes...
[2017-01-20 23:19:50,244] ERROR: Connection refused: connect
java.net.ConnectException: Connection refused: connect
   at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
   at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
   at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
   at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
   at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
   at java.net.PlainSocketImpl.connect(Unknown Source)
   at java.net.SocksSocketImpl.connect(Unknown Source)
   at java.net.Socket.connect(Unknown Source)
   at com.tchegbe.lib.b.a.b.a(Unknown Source)
   at com.tchegbe.lib.b.a.e.a(Unknown Source)
   at com.tchegbe.lib.b.a.g.a(Unknown Source)
   at com.tchegbe.lib.common.i.d.b(Unknown Source)
   at com.tchegbe.lib.common.i.a.run(Unknown Source)
[2017-01-20 23:19:50,837] ERROR: Connection refused: connect
java.net.ConnectException: Connection refused: connect
   at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
   at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
   at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
   at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
   at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
   at java.net.PlainSocketImpl.connect(Unknown Source)
   at java.net.SocksSocketImpl.connect(Unknown Source)
   at java.net.Socket.connect(Unknown Source)
   at com.tchegbe.lib.b.a.b.a(Unknown Source)
   at com.tchegbe.lib.b.a.e.a(Unknown Source)
   at com.tchegbe.lib.b.a.g.a(Unknown Source)
   at com.tchegbe.lib.common.i.d.a(Unknown Source)
   at com.tchegbe.lib.common.i.a.<unknown>(Unknown Source)
   at com.tchegbe.lib.common.i.a.<unknown>(Unknown Source)
   at com.tchegbe.lib.common.i.a.<unknown>(Unknown Source)
   at com.tchegbe.lib.io.a.h.run(Unknown Source)
[2017-01-20 23:22:40,628] INFO : Setting core service port to 9595
[2017-01-20 23:22:40,659] INFO : Setting Http port to 8080
[2017-01-20 23:22:40,659] INFO : Https port not set. Https requests will not be serviced...
[2017-01-20 23:22:40,768] INFO : Starting FlexRAID...
[2017-01-20 23:22:40,784] INFO : FlexRAID 2.1 [Snapshot 1.1 / Storage Pool 2.0] [2015.11.01]
[2017-01-20 23:22:40,784] INFO : Started: January 20, 2017 11:22:40 PM EST
[2017-01-20 23:23:00,400] INFO : Initiating FlexRAID Storage Pool...
[2017-01-20 23:23:00,924] INFO : Host OS = Microsoft Windows [Version 6.3.9600]
[2017-01-20 23:23:00,924] INFO : Building View Registry...
[2017-01-20 23:23:00,991] INFO : Building Auto View Registry...
[2017-01-20 23:23:01,007] INFO : Loading View options...
[2017-01-20 23:23:01,409] INFO : Using Auto-View definition...
[2017-01-20 23:23:01,409] INFO : Building View Handler...
[2017-01-20 23:23:01,731] INFO : Mounting Storage Pool drive...
[2017-01-20 23:23:23,156] INFO : Reconnecting d$="D:" /GRANT:kevin,FULL
[2017-01-21 00:16:41,908] INFO : Executing update...
[2017-01-21 00:16:41,909] INFO : [update] started at: Sat Jan 21 00:16:41 EST 2017
[2017-01-21 00:16:54,850] INFO : Using RAID engine Tx
[2017-01-21 00:21:34,910] INFO : Total process size = 440686084096
[2017-01-21 00:21:34,910] INFO : Max data size = 3905625784320
[2017-01-21 00:21:34,910] INFO : DRU1 size=3742662222994
[2017-01-21 00:21:34,910] INFO : DRU2 size=2923890489322
[2017-01-21 00:21:34,910] INFO : DRU3 size=2978383160150
[2017-01-21 00:21:34,910] INFO : DRU4 size=2976806162556
[2017-01-21 00:21:34,910] INFO : DRU5 size=2932063522204
[2017-01-21 00:21:34,910] INFO : DRU6 size=3387872483082
[2017-01-21 00:21:34,910] INFO : DRU7 size=3905625465549
[2017-01-21 00:21:34,910] INFO : DRU8 size=3772404523091
[2017-01-21 00:21:34,910] INFO : DRU9 size=3895048906964
[2017-01-21 00:21:34,910] INFO : DRU10 size=3115721260535
[2017-01-21 00:21:34,910] INFO : DRU11 size=2722723491891
[2017-01-21 00:43:16,127] WARN : process has not exited
java.lang.IllegalThreadStateException: process has not exited
   at java.lang.ProcessImpl.<unknown>(Unknown Source)
   at java.lang.ProcessImpl.waitFor(Unknown Source)
   at com.tchegbe.a.a.c.a(Unknown Source)
   at com.tchegbe.a.a.c.<unknown>(Unknown Source)
   at com.tchegbe.a.a.c.<unknown>(Unknown Source)
   at com.tchegbe.lib.io.view.k.s.<unknown>(Unknown Source)
   at com.tchegbe.lib.io.view.k.s.<unknown>(Unknown Source)
   at com.tchegbe.lib.io.view.k.s.<unknown>(Unknown Source)
   at com.tchegbe.flexraid.main.b.A.getFlags(Unknown Source)
   at com.google.gwt.user.server.rpc.RPC.<unknown>(Unknown Source)
   at com.google.gwt.user.server.rpc.RemoteServiceServlet.processCall(Unknown Source)
   at com.google.gwt.user.server.rpc.RemoteServiceServlet.processPost(Unknown Source)
   at com.google.gwt.user.server.rpc.AbstractRemoteServiceServlet.doPost(Unknown Source)
   at javax.servlet.http.HttpServlet.service(Unknown Source)
   at javax.servlet.http.HttpServlet.service(Unknown Source)
   at org.eclipse.jetty.servlet.ServletHolder.handle(Unknown Source)
   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(Unknown Source)
   at com.tchegbe.lib.gwt.server.servlet.SecurityFilter.doFilter(Unknown Source)
   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(Unknown Source)
   at org.eclipse.jetty.servlet.ServletHandler.doHandle(Unknown Source)
   at org.eclipse.jetty.server.session.SessionHandler.doHandle(Unknown Source)
   at org.eclipse.jetty.server.handler.ContextHandler.doHandle(Unknown Source)
   at org.eclipse.jetty.servlet.ServletHandler.doScope(Unknown Source)
   at org.eclipse.jetty.server.session.SessionHandler.doScope(Unknown Source)
   at org.eclipse.jetty.server.handler.ContextHandler.doScope(Unknown Source)
   at org.eclipse.jetty.server.handler.ScopedHandler.handle(Unknown Source)
   at org.eclipse.jetty.server.handler.HandlerList.handle(Unknown Source)
   at org.eclipse.jetty.server.handler.HandlerWrapper.handle(Unknown Source)
   at org.eclipse.jetty.server.Server.handle(Unknown Source)
   at org.eclipse.jetty.server.HttpConnection.handleRequest(Unknown Source)
   at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(Unknown Source)
   at org.eclipse.jetty.http.HttpParser.parseNext(Unknown Source)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(Unknown Source)
   at org.eclipse.jetty.server.BlockingHttpConnection.handle(Unknown Source)
   at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(Unknown Source)
   at org.eclipse.jetty.util.thread.QueuedThreadPool.<unknown>(Unknown Source)
   at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(Unknown Source)
   at java.lang.Thread.run(Unknown Source)
[2017-01-21 01:37:03,839] WARN : D:5:1\youtubedownload\Ming Xi on Instagram “💗💗💗#TrainLikeAnAngel #VSFashionShow.mp4 no longer exist! Skipping...
[2017-01-21 06:24:51,273] INFO : Successful UPDATE operation!
[2017-01-21 06:25:19,655] INFO : [update] completed at: Sat Jan 21 06:25:19 EST 2017
[2017-01-21 06:25:20,473] INFO : Sending email alert - message=[SNAP-RAID1][UPDATE] SUCCESS
[2017-01-21 06:25:30,785] INFO : [SNAP-RAID1][UPDATE] SUCCESS - emailed...
[2017-01-23 02:00:00,258] INFO : Executing update...
[2017-01-23 02:00:00,258] INFO : [update] started at: Mon Jan 23 02:00:00 EST 2017
[2017-01-23 02:00:28,908] INFO : Using RAID engine Tx
[2017-01-23 02:08:03,185] INFO : Total process size = 48191504384
[2017-01-23 02:08:03,185] INFO : Max data size = 3905625784320
[2017-01-23 02:08:03,185] INFO : DRU1 size=3742662222994
[2017-01-23 02:08:03,185] INFO : DRU2 size=2923890489322
[2017-01-23 02:08:03,185] INFO : DRU3 size=2978383160150
[2017-01-23 02:08:03,185] INFO : DRU4 size=2976806162556
[2017-01-23 02:08:03,185] INFO : DRU5 size=2932063522204
[2017-01-23 02:08:03,185] INFO : DRU6 size=3387872483082
[2017-01-23 02:08:03,185] INFO : DRU7 size=3905625465549
[2017-01-23 02:08:03,185] INFO : DRU8 size=3796419348022
[2017-01-23 02:08:03,185] INFO : DRU9 size=3895048906964
[2017-01-23 02:08:03,185] INFO : DRU10 size=3115721260535
[2017-01-23 02:08:03,185] INFO : DRU11 size=2722723491891
[2017-01-23 02:19:01,920] WARN : D:5:1\youtubedownload\Ming Xi on Instagram “💗💗💗#TrainLikeAnAngel #VSFashionShow.mp4 no longer exist! Skipping...
[2017-01-23 02:41:02,142] INFO : Successful UPDATE operation!
[2017-01-23 02:41:04,328] INFO : [update] completed at: Mon Jan 23 02:41:04 EST 2017
[2017-01-23 02:41:07,519] INFO : Sending email alert - message=[SNAP-RAID1][UPDATE] SUCCESS
« Last Edit: February 12, 2017, 09:03:55 am by TheJSN »

Offline TheJSN

  • Newbie
  • *
  • Posts: 13
  • Karma: +0/-0
Brahim, I've sent and posted my logs from the last failure, which I could not recover from. Any ideas? Is it the validate process that is deleting the files? Does it do the same checks as update for failed drives?
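
To help narrow down which task logged the deletions, the log can be scanned outside of FlexRAID. The sketch below is only an illustration based on the line format visible in the excerpt above (`[timestamp] LEVEL : message`, `[task] started at:`, `[task] completed at:`, and `WARN : Deleted:`). It counts `Deleted:` warnings per task run. The function names and the threshold of 100 are assumptions, not anything FlexRAID provides.

```python
import re

# Matches lines such as "[2017-01-19 05:14:01,039] WARN : Deleted: ..."
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z]+)\s*:\s(?P<msg>.*)$")
START_RE = re.compile(r"^\[(?P<task>\w+)\] started at:")
END_RE = re.compile(r"^\[(?P<task>\w+)\] completed at:")


def deletions_per_run(lines):
    """Return (task, start_timestamp, deleted_count) for each completed task run.

    Lines without a timestamp (stack traces, hand-written notes) are skipped.
    A run that never logs "completed at:" is dropped when the next run starts.
    """
    runs = []
    current = None  # [task, start_timestamp, deleted_count] of the run in progress
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        msg = match.group("msg")
        start = START_RE.match(msg)
        if start:
            current = [start.group("task"), match.group("ts"), 0]
        elif current is None:
            continue
        elif END_RE.match(msg):
            runs.append(tuple(current))
            current = None
        elif match.group("level") == "WARN" and msg.startswith("Deleted: "):
            current[2] += 1
    return runs


def suspicious_runs(runs, threshold=100):
    """Return the runs whose deletion count reaches the (arbitrary) threshold."""
    return [run for run in runs if run[2] >= threshold]
```

Feeding the log file's lines to `deletions_per_run` and then to `suspicious_runs` would show whether the mass deletions fall inside a validate run or an update run, and at what start time.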