Author Topic: All of a sudden: very slow copy speeds, slow verify sync  (Read 1153 times)

Offline pooler1

  • Full Member
  • ***
  • Posts: 111
  • Karma: +0/-0
All of a sudden: very slow copy speeds, slow verify sync
« on: February 05, 2019, 07:15:27 pm »
I've had tRAID running on a machine for a while, with two pools on the same machine.  All of a sudden, my second pool became almost too slow to use (transfers of a few bytes/s or kB/s).  What led up to this is that I was moving a lot of GB from the second pool to the first one, so file transfer operations were running for about a full day.

The read percentage for most of the disks in the second pool shows about 99%.  Every few hours a lock override count starts up, never more than 10 and usually 1.  I clear it, but the speed doesn't improve.  Also, clearing the count doesn't reset it to zero; it stays the same ("1" right now).

Verify sync has slowed to a crawl: about 1 MB/s throughput, when it used to be around 300-600 MB/s.

I attached a screenshot of the RAID monitoring plugin.

Offline pooler1

« Reply #1 on: February 05, 2019, 09:22:17 pm »
Some more info:
I've been copying the data off the problem pool's drives.  Mounted individually, I'm having no issues.  When I mounted one (with the problem files on it), Windows told me the disk has problems, but HDS tells me the disk is perfect.
Maybe I just need to recreate the array?

Offline cogliostrio

  • Jr. Member
  • **
  • Posts: 65
  • Karma: +2/-0
« Reply #2 on: February 06, 2019, 04:50:23 am »
Could it be a controller issue?

Offline Skirge01

  • Full Member
  • ***
  • Posts: 203
  • Karma: +5/-0
« Reply #3 on: February 06, 2019, 10:32:02 am »
When was the last extended self-test performed on the disks (you can see that under Information in HDS)?  You may want to trigger one manually.  Also, do you have HDS's Health Calculation Method set to default or more strict, for servers?  Did you try rebooting the server to see if the issue continues?  There's always the chance that some rogue process is causing a bottleneck.

Offline pooler1

« Reply #4 on: February 06, 2019, 06:38:27 pm »
Thanks...

I've never run an extended self-test.  I'll do that for the drive Windows said needs to be fixed.  I did mount all the drives individually and copy the content over.  Even then, the same files that were a problem on the pool also transferred slowly from the individually mounted drives.  So there is an issue with some files, but they still transferred much faster than through the pool: individually they ran at 2-20 MB/s, whereas normal is anywhere from 50-200 MB/s.  From the pool, these files slowed the transfer to a crawl, on the order of bytes/s.
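
One way to put numbers on "slow file vs. normal file" is to time a plain sequential read of each file, once from the individually mounted drive and once through the pool.  The following is only a minimal sketch: the function name is made up for illustration, it is not part of tRAID or HDS, and a file that was read recently may be served from the OS cache and look faster than the disk really is.

```python
import time


def read_throughput_mb_s(path, chunk_size=1024 * 1024):
    """Read a file sequentially and return the observed throughput in MB/s.

    Hypothetical helper for illustration only; it measures a plain read,
    not a copy, so the destination drive is not part of the result.
    """
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return (total_bytes / 1_000_000) / max(elapsed, 1e-9)
```

Running it against the same file on the pool drive letter and on the individually mounted disk gives two comparable figures, which helps separate a slow disk from a slow pool layer.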

HDS setting is strict for servers.
I've rebooted multiple times, and it didn't change anything this time.  Usually, rebooting solves file transfer problems for me.

From what I can tell, there must be a problem with some of these files.  I also checked the path lengths of the files, and they are approaching or exceeding the 256-character limit for the path.  That's not usually a problem, but I have had issues with it before.
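
A path-length check like the one described can be scripted.  This is a rough sketch with a hypothetical function name; the 256 default mirrors the figure quoted in the post (the classic Windows MAX_PATH limit is 260 characters including the drive prefix and terminator, so adjust as needed).  Note that `os.walk` silently skips directories it cannot list, so very long paths that the OS refuses to open may be under-reported.

```python
import os


def find_long_paths(root, limit=256):
    """Yield (length, full_path) for entries whose full path has at least `limit` characters.

    Hypothetical helper for illustration; `limit` is an assumption taken
    from the post, not a value read from the system.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) >= limit:
                yield len(full), full
```

For example, `sorted(find_long_paths("D:\\"), reverse=True)[:20]` would list the twenty longest offenders on a drive mounted as D: (the drive letter is just a placeholder).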

Here's something else I did that helped while I was troubleshooting, although it wasn't initially part of the problem:
I noticed my landing disk was set to a partition that had less than 500 MB available.  I thought I had chosen a partition with at least 200 GB (and the landing disk's minimum setting is 50 GB).  So I changed the landing disk to the larger partition, and now file transfers are still slow, but tolerable.  Instead of slowing to a crawl, I now get 2-10 MB/s for the problem files.
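
A quick sanity check of free space on whatever partition the landing disk points at could look like the sketch below.  It only uses Python's standard `shutil.disk_usage`; the function name is hypothetical, the 50 GB default is taken from the minimum mentioned in the post, and nothing here reads tRAID's actual configuration.

```python
import shutil


def check_free_space(path, min_free_gb=50):
    """Return (ok, free_gb) for the volume containing `path`.

    `min_free_gb` defaults to the 50 GB minimum quoted in the thread;
    treat it as an assumption, not a tRAID-defined constant.
    """
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    return free_gb >= min_free_gb, free_gb
```

Calling `check_free_space("E:\\")` (placeholder drive letter) before a large transfer would have flagged a partition with under 500 MB free right away.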

I am currently running a verify sync on the problem pool.  It is also slow, but with the landing disk change above, it is now around 20 MB/s throughput.  Before it was about 2 MB/s.  Hopefully, after this sync, it will be back to 200+ MB/s.

Offline pooler1

« Reply #5 on: February 06, 2019, 06:41:12 pm »
Regarding the controller question: it could be.  I have two M1015s, one for each pool.  The only difference is that the first M1015 is in a fast x16 PCIe slot, and the other one is in an x8 slot, although I think when you do that, the motherboard treats them both as x8.  I don't think it's the controller, because it worked fine until this one massive file transfer operation.  I suspect the disk, because even individually mounted, those files transferred slowly.

Offline Skirge01

« Reply #6 on: February 07, 2019, 02:32:55 pm »
When you say you tried to "mount all the drives individually" do you mean without tRAID running?  I ask because I was going to suggest disabling the landing disk to see if that alleviates the issue.  But, if they're that slow without any tRAID, then it's certainly not the software causing the issue.  However, if tRAID was still running and the disks were just also assigned letters or otherwise mounted, tRAID could still be the bottleneck.
« Last Edit: February 08, 2019, 01:31:11 pm by Skirge01 »

Offline pooler1

« Reply #7 on: February 07, 2019, 07:55:43 pm »
I mounted them while tRAID was running.  I don't change any files; I just mount the drives to see what happens and check the files.
I think you are right, there is something about the landing disk.

Here's what I found today.  I tracked down the drive holding the files that were causing problems in the pool, and when I mounted it, Windows notified me that the drive required repair.  Before repairing, I copied the files to another drive.  Then I ran the repair, and all was good.
After the repair, the disk was dropped from the array.  But I checked the files, and they look good.  So I unfailed it and put the pool back up.  I am currently running a verify sync, and it is still slow at about 20 MB/s.  But I've seen this before: it is slow initially and then things are OK, I think?

But the landing disk... I mentioned this all started when I was transferring hundreds of GB back and forth between tRAID pools.  The files I was transferring were on the problem drive.  What's weird is that files from pool #1 were showing up on pool #2, but I didn't transfer anything from pool 1 to pool 2.  How did those files get there?  It could only be the landing disk.

So now I am questioning: what is the benefit of the landing disk?  Copying files directly from disk to disk is faster than tRAID can handle anyway.  tRAID usually gives me 70-200 MB/s, and direct disk transfers are around that speed too.  My landing disk is an SSD, but I can't tell if it helps at all.

And do I need to consider so-called "catch up" periods as the landing disk fills up?  Do I have to wait a while between large transfers?  I am not clear on a lot of this, even though I have read all the documentation and have run these tRAID pools for years now.  I'm not worried, I just don't get it.  And no offense to Brahim, but much of what he says is unclear.  But I don't pay for premium service; maybe he's clearer there.  Anyway, I don't blame him.  The product works fine, it's just hard to troubleshoot.  I'd be more concerned if problems left data unusable, but all my problems in the past have been solved by copying files to other drives.  It's very easy, so I'm not worried.

But this issue is annoying because of the time it takes to verify, reboot, start and stop arrays, track down drives, etc.

Offline Skirge01

« Reply #8 on: February 08, 2019, 01:44:33 pm »
Before continuing to troubleshoot the array itself or the copying, I would strongly recommend performing an extended test on both the landing disk's drive and the disks you've been copying from/to.  Make sure those are fine before anything else because if the hardware's bad, nothing you do in tRAID is going to solve a thing and you're wasting your time.  Be aware the extended tests can take many hours on large drives.

Once you've got this issue resolved, I'd suggest reading over the performance tuning threads and tweaking your settings, performing benchmarks after each change (don't change too much at once).  Personally, I've never set up a landing disk because my files are mostly huge multi-gig media files, so I felt that it might actually slow my array down.  If I did start using one, I'd likely exclude all my media files from using the landing disk.  For comparison, my Verify Syncs show a throughput into the 700MB/s range on a regular basis (see attachment).

Offline pooler1

« Reply #9 on: February 08, 2019, 02:07:16 pm »
OK, I'm running the extended self-test now.  It will take about 10 hours.  But the disk shows 100% on everything so far in HDS.

I've been through all the performance steps.  But I haven't tested each setting as you say; I kind of just set them all at once, I suppose.
I'll try not using a landing disk.  So is the landing disk where, if you do a file transfer, the files go there first as an intermediary and then move on to the final destination?  Anyway, that definitely was the problem when I was transferring the files.  I'll have to try with and without.

Normally my verify syncs are in the 700 MB/s range also.  But I have had situations, like creating the pool for the first time, where it starts off slow and then after the first run it's back to full speed.  I hope that is normal.

Offline Skirge01

« Reply #10 on: February 09, 2019, 09:44:02 am »
If the extended test finds something, the health will drop accordingly.  I've had that happen many times, often toward the very end of the test.

When you go through the performance guide, you'll see that Brahim talks about that dip in performance.  The goal is to get the benchmarks to be as level as possible throughout the entire process.  Playing with all those settings is how to attempt to achieve that.

What you described is how I understand landing disks to work, but, as I said, I've never used that feature.  I believe the idea is to have that speedy disk hold the data during busy times and then move the data to the proper, less speedy drive as there's less of a load on the array.

Offline pooler1

« Reply #11 on: February 09, 2019, 01:33:11 pm »
OK, thanks.  I may try without the landing disk; I feel like the speeds should be OK.  I'll experiment.

But good news!
The extended test found nothing.  I also did a verify sync yesterday and it was slow, around 10 MB/s.  I cancelled it after about 10-20 hours.  Now, I just ran it again after rebooting and it is back up to 400 MB/s or so.  This has happened before, and I don't get it.
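
To see why the throughput figure matters so much for a verify sync, here is a back-of-the-envelope estimate.  The array size is not stated anywhere in the thread, so the 10 TB in the usage note is purely a hypothetical figure, and the function name is made up for illustration.

```python
def sync_eta_hours(total_tb, throughput_mb_s):
    """Estimate hours needed to process `total_tb` terabytes at `throughput_mb_s` MB/s.

    Uses decimal units (1 TB = 1,000,000 MB) and assumes the throughput
    stays constant, which a real verify sync will not guarantee.
    """
    total_mb = total_tb * 1_000_000
    return total_mb / throughput_mb_s / 3600
```

For a hypothetical 10 TB array, `sync_eta_hours(10, 10)` comes to roughly 278 hours (over 11 days), while `sync_eta_hours(10, 400)` comes to roughly 7 hours, which is why a run stuck at 10 MB/s is worth cancelling and retrying after a reboot.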

But what really happened was this:
I was copying lots of files back and forth between two tRAID pools, like 50 GB every hour or so.
Then something happened, and the transfers slowed to a crawl.
On the problem disk, some of the files I was transferring were all screwed up.  They even transferred slowly directly off the disk.  Windows detected a problem with the disk, and I repaired it with Windows.  Then I ran the verify sync as described above.
There were actually two directories with the exact same name on the disk, which was odd.  I deleted one.

Anyway, it looks like everything is back to normal, and once the verify sync is done, it will be OK.  This is what I like about tRAID, despite some of the support complexities.  The data is always fine; it's the configuration that I spend the time on.  Throughout this process, I wasn't that worried about losing files.  It's just that some of the technical features are hard for me to understand.

Offline pooler1

« Reply #12 on: February 09, 2019, 01:48:03 pm »
I guess if there is a "bug" in all this, it's whatever happened when transferring files between the pools a lot.  I'd describe it as the landing disk filling up or not being able to keep up, or something like that, and then the engine got confused as to what goes where.  Now, sometimes when I'm doing these kinds of transfers, I feel like I should do a transfer, wait for the queue to catch up, and then do it again.  But I would like NOT to have to factor that in.  I'd like to just transfer immediately and often without worry.

Offline Skirge01

« Reply #13 on: February 10, 2019, 09:56:19 am »
Glad you got it figured out.  Don't know if it's because I don't use a landing disk, but I never have an issue transferring 10s of GBs to/from the array or within itself.  You could try running without one and see how your performance is.  If all is good, you could put it back in play, but make sure you have your largest files skipping the landing disk.

Offline pooler1

« Reply #14 on: February 10, 2019, 01:09:44 pm »
You know, I think you are right.  I just have to try it and see.
BTW, I'm still verify syncing.  It has slowed back down to about 45 MB/s.  But it's OK, I've been through this before.  It is doing something, and once it gets through in a few days, it should be back to normal.
The other option is to just remove the configuration and redo the whole thing.  But I don't know if that would make the verify sync any faster.  I don't see why it would, but I also want to say verify syncs on configurations made from scratch are always pretty fast.