Author Topic: verify sync fails shortly after starting due to "disk error"  (Read 3623 times)

Offline pooler1

  • Jr. Member
  • **
  • Posts: 78
  • Karma: +0/-0
    • View Profile
Situation:
I had a failed disk.
I unfailed it and checked the data, which was fine.
Then I ran a verify sync on the array.
A couple of minutes in, it aborts with many red error messages like:
error code 1073741668
operation aborted due to disk error!
failed UoR position = 5
failed uor id = xxxxxxxxxxx

Not sure how I should proceed.
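
One hedged reading of the error code above: if the number is a Windows NTSTATUS value printed without its sign (an assumption; the thread does not say how the application formats it), then 1073741668 corresponds to 0xC000009C, which Windows defines as STATUS_DEVICE_DATA_ERROR. A minimal Python sketch of that conversion follows; the helper name `as_ntstatus` and the one-entry lookup table are illustrative only.

```python
def as_ntstatus(code: int) -> int:
    """Treat a decimal code as the magnitude of a negative 32-bit NTSTATUS.

    Assumption: the application dropped the minus sign when printing.
    """
    return (-abs(code)) & 0xFFFFFFFF


# Tiny illustrative lookup table, not a complete list of NTSTATUS values.
KNOWN_STATUS = {0xC000009C: "STATUS_DEVICE_DATA_ERROR"}

if __name__ == "__main__":
    value = as_ntstatus(1073741668)
    print(f"0x{value:08X}", KNOWN_STATUS.get(value, "unknown"))
```

If that reading is right, the failure is being reported by the storage device or its data path rather than by the file system.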

Offline pooler1

« Reply #1 on: April 09, 2017, 04:09:07 pm »
The Windows event log says the dropped disk has bad blocks.
chkdsk says there are no problems.
I checked the data on the drive, and it looked good.
Not sure how to get the array going. Things seem OK, but the verify tasks give errors.

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,547
  • Karma: +204/-16
    • View Profile
« Reply #2 on: April 10, 2017, 10:49:44 am »
Could be mechanical issues that pop up when the disk is under load. Check SMART.

Offline pooler1

« Reply #3 on: April 10, 2017, 01:33:29 pm »
Maybe I should replace it. I tried to run a disk test on it, and that also fails a few seconds in. But the SMART data says the disk is fine.
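
When the SMART summary says a disk is fine, the individual raw counters can still be worth a look. A minimal sketch, assuming the disk reports ATA-style attributes and that the JSON output of smartmontools (`smartctl --json`) has already been loaded into a dict. The sample values are made up for illustration and are not from this thread, and the helper name `nonzero_counters` is hypothetical. Attribute 199 (UDMA CRC error count) is commonly associated with cabling or connection problems rather than with the disk itself.

```python
import json

# IDs of ATA SMART attributes whose raw value should normally stay at zero.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
    199: "UDMA_CRC_Error_Count",
}


def nonzero_counters(report: dict) -> dict:
    """Return {attribute name: raw value} for watched attributes above zero."""
    table = report.get("ata_smart_attributes", {}).get("table", [])
    return {
        WATCHED[row["id"]]: row["raw"]["value"]
        for row in table
        if row["id"] in WATCHED and row["raw"]["value"] > 0
    }


# Made-up sample in the shape smartctl's JSON output uses; not real data.
SAMPLE = json.loads("""
{"ata_smart_attributes": {"table": [
  {"id": 5, "name": "Reallocated_Sector_Ct", "raw": {"value": 0}},
  {"id": 199, "name": "UDMA_CRC_Error_Count", "raw": {"value": 12}}
]}}
""")

if __name__ == "__main__":
    print(nonzero_counters(SAMPLE))
```

Drives behind a SAS controller may report different fields, so treat the attribute table as an assumption to verify on the actual hardware.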

Offline Brahim

« Reply #4 on: April 11, 2017, 04:08:45 pm »
That would indeed be the safe thing to do. Another possibility could be that the controller port it is on is bad.
So, if the replacement disk has the same issue on the same port, then you have a new lead. Otherwise, you will know that it was indeed the disk.

Offline pooler1

« Reply #5 on: April 11, 2017, 04:43:50 pm »
OK, I replaced the disk, and a new array was created.
I am currently running a verify sync with the same configuration settings as before.
I noticed the throughput is now about 1/3 of what it used to be: down from 600 to about 200 Mbps.
The verify sync is proceeding without errors, unlike previously.
However, the Windows event log is still logging errors, just not the "bad block" error from before.
Now it is a bunch of LSI SAS2 controller errors (event ID 11),
and also a bunch of warnings for the replaced disk (event ID 153) saying the IO operation at a logical block address was retried.

I looked these up online. Most of the advice is to run chkdsk and SMART tests. Both show no problems.
There was another traid user on one of the other forums with a similar error.

So things are working now, but I am concerned about why the speed is slower and why these errors didn't appear until I did a big Windows update.

Offline pooler1

« Reply #6 on: April 11, 2017, 06:51:12 pm »
Update: the verify sync aborted again.
The replaced disk gave another error. Maybe there is something wrong with the cable for that drive.

Offline pooler1

« Reply #7 on: April 11, 2017, 09:32:53 pm »
I think the latest Windows update changed something that is causing these problems. There has already been news that the Feb update affected SMB, and there definitely was a big change made a few months ago regarding elevated permissions and such.
I changed the cables around to see if it's the cable on that one disk. I don't think it is, because now I am getting errors with 4 other disks. So I think it's a Windows problem somewhere.

Offline pooler1

« Reply #8 on: April 11, 2017, 10:11:59 pm »
Odd...
I replaced the original drive that had the problem.
Then the new drive had similar problems.
So I swapped its cable with the one on the drive next to it.
The drive that got the suspect cable then started to have problems.
So I figured the cable was the issue.
So I replaced the (breakout) cable entirely.
Now the same drive as before the cable change has problems!
How could that be? I get a bunch of event ID 153 errors when I try to run a verify sync. But at least it's not aborting, and the speed is back to full speed. So hopefully the verify sync will finish, and I can figure out what the problem is afterwards.

Offline Brahim

« Reply #9 on: April 12, 2017, 07:25:21 am »
Check to make sure your PSU is adequate. Then, try updating your controller driver. If all that fails, you might want to try a new controller.

Offline pooler1

« Reply #10 on: April 12, 2017, 01:53:58 pm »
Thanks.

I changed the cable, and the verify sync completed successfully. So it must have been a bad cable. The test completed at the expected speed with the new cable, so the speed problem is gone. I haven't gotten any Windows errors since the verify completed. I believe there are no more issues for the time being.

Offline pooler1

« Reply #11 on: April 12, 2017, 07:49:03 pm »
So here's what I have learned, somewhat related to the issue here...
I have now noticed that I usually run into issues with traid when I reach (and pass) the limit in the settings, the one that stops writes when only a certain percentage of storage is left. That's when I get "access denied" and similar messages. However, depending on a bunch of factors, I think, errors may arise when copying/moving files either when the limit is reached or afterwards. Obviously, traid is not supposed to keep writing after that.

At first, I wasn't paying attention and may have written data that got lost. Upon rebooting, things go back to normal. Not sure it's related, but I got the issue in this thread right at the point of reaching the max. I need to remember to avoid doing file operations when I'm close to the limit.

Offline Brahim

« Reply #12 on: April 12, 2017, 10:27:59 pm »
If you are referring to the pool, you need to set a more appropriate reserve space value. Not all programs can handle being paused while the pool moves the partial data on a full disk to the next. It is not the pool failing; it is the client program throwing an error after timing out while paused.
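
As a rough illustration of the reserve idea from the client side, here is a minimal Python sketch that checks whether a copy would push a target below a reserve before starting it. This is not how the pool itself works; `fits_with_reserve`, `can_copy`, and the `reserve_pct` default are hypothetical names and values, and the percentage should mirror whatever reserve is actually configured on the pool.

```python
import shutil


def fits_with_reserve(free_bytes, total_bytes, file_bytes, reserve_pct):
    """True if writing file_bytes keeps free space at or above the reserve."""
    reserve_bytes = total_bytes * reserve_pct / 100
    return free_bytes - file_bytes >= reserve_bytes


def can_copy(target_dir, file_bytes, reserve_pct=5.0):
    """Check the volume holding target_dir against an assumed reserve."""
    usage = shutil.disk_usage(target_dir)
    return fits_with_reserve(usage.free, usage.total, file_bytes, reserve_pct)


if __name__ == "__main__":
    # Would a 10 GiB file fit in the current directory's volume with 5% kept free?
    print(can_copy(".", 10 * 1024**3))
```

A check like this only helps avoid starting a transfer near the limit; it does not change how a client program reacts if it is paused mid-write.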

Offline pooler1

« Reply #13 on: April 14, 2017, 01:07:12 pm »
Yes, that is exactly what it is, thanks. So you're saying that if the limit is higher, there is less of a chance that the clients will have issues?

Offline pooler1

« Reply #14 on: April 14, 2017, 05:55:57 pm »
I am still getting some errors in the Windows event log when I copy/move with traid:
error 153. the IO operation at logical block address ... disk11 was retried.

It doesn't seem serious. But it's always for the disk that I just replaced, and I replaced the cable, too. It's the only disk giving errors, and I can't figure out why it's always the same disk.
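
To confirm whether the retries really are confined to one disk, the event 153 messages can be tallied per disk number. A minimal Python sketch, assuming the messages have been exported from the event log as plain text, one per entry; the sample messages, their addresses, and the helper name `retries_per_disk` are made up for illustration and are not taken from this system.

```python
import re
from collections import Counter

# Matches "Disk 11" as well as the "disk11" spelling used in the post above.
DISK_RE = re.compile(r"\bdisk\s*(\d+)", re.IGNORECASE)


def retries_per_disk(messages):
    """Count retry messages per disk number found in each message."""
    counts = Counter()
    for message in messages:
        match = DISK_RE.search(message)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample messages; the addresses are placeholders.
SAMPLE = [
    "The IO operation at logical block address 0x1000 for Disk 11 was retried.",
    "The IO operation at logical block address 0x2000 for Disk 11 was retried.",
    "The IO operation at logical block address 0x3000 for Disk 4 was retried.",
]

if __name__ == "__main__":
    for disk, count in retries_per_disk(SAMPLE).most_common():
        print(f"disk {disk}: {count} retries")
```

If one disk number dominates after the disk and cable have both been replaced, the remaining suspects named earlier in the thread are the controller port, the controller driver, and the PSU.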