Author Topic: small files in the pool get corrupted when data is appended  (Read 6634 times)

Online Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,547
  • Karma: +204/-16
    • View Profile
Re: small files in the pool get corrupted when data is appended
« Reply #15 on: June 22, 2015, 02:01:15 pm »
You can try disabling caching on the pool and see if that helps.
Without fully understanding the program, I can't tell you for certain the exact source of the issue. However, I personally would urge you to rewrite the program to conform to standard I/O boundaries. Even in C#, Java, and other high-level programming languages, users have control over the buffer and its contents, along with how much of the buffer to write out. I suspect the same is possible in Delphi (http://www.delphibasics.co.uk/RTL.asp?Name=BlockWrite).
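
For illustration only, here is a minimal sketch of what explicit control over the write buffer can look like. It is written in Python rather than Delphi, the function name append_bytes and the 4096-byte block size are assumptions made for this example, and whether writing this way avoids the corruption on the pool has not been tested in this thread.

```python
import os

def append_bytes(path, payload, block_size=4096):
    """Append payload to path in explicit blocks, then force it to disk."""
    # buffering=0 disables Python's own buffer, so every write() goes to the OS.
    with open(path, "ab", buffering=0) as f:
        view = memoryview(payload)
        for offset in range(0, len(view), block_size):
            block = view[offset:offset + block_size]
            written = 0
            # A raw write may accept fewer bytes than requested, so loop until done.
            while written < len(block):
                written += f.write(block[written:])
        # Ask the OS to flush its cache for this file to the storage device.
        os.fsync(f.fileno())
```

The caller decides how much goes out per write call and when the data is flushed, instead of leaving both to a runtime buffer.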

Offline uhthome

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
    • View Profile
Re: small files in the pool get corrupted when data is appended
« Reply #16 on: July 10, 2015, 01:59:45 pm »
Hi, I have done some more testing. Disabling pool caching did not help. Furthermore, other write strategies did not help. Did I mention that the problem affects only writes to the pool, but not writes to individual drives within the pool?

One thing helped: inserting a 2-second delay between opening the file and starting the write commands. It looks like, without the delay, the writes are executed faster than the file buffer can be filled.

However, I do not like this solution very much. Is there a way to modify FlexRAID so that the write commands are always executed after the file buffer has been filled? Should I disable TCQ or SWO, or modify other settings?

Online Brahim

Re: small files in the pool get corrupted when data is appended
« Reply #17 on: July 10, 2015, 06:49:59 pm »
TCQ and SWO apply to the RAID and not the pool. If writing directly to the tRAID disks does not exhibit the issue, then changing those settings won't change much.
I really don't see how waiting a few seconds after a file is opened affects the behavior of the I/O buffer.

Offline uhthome

Re: small files in the pool get corrupted when data is appended
« Reply #18 on: August 04, 2015, 03:41:45 pm »
Well, disabling TCQ does indeed make a difference. With TCQ disabled, the problem of small files' headers getting overwritten by other files' data during an append operation does not occur. I have tested this repeatedly (without the wait period described earlier). Without TCQ, a wait is not necessary to prevent the file corruption.

It seems that an I/O buffer does not get filled with the right data, and when the file is written back, data that happened to be in the buffer from previous operations gets incorporated into the file.

As I understand it, the nature of TCQ is that the order in which I/O commands are executed is optimized to achieve the fastest hard drive response. That means I/O operations are not always executed in the order in which they are requested, but rather in the order that is easiest for the drive. It would therefore be at least plausible that a read command meant to fill an I/O buffer is postponed by the TCQ algorithm, leaving outdated data in the I/O buffer, and that subsequent operations then incorporate the outdated bytes into the file being written.
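
To make this hypothesis concrete, here is a toy model in Python. It is only a sketch of the suspected failure mode, not of how TCQ or the pool driver actually work; the function write_back and its parameters are invented for this illustration.

```python
def write_back(file_old, appended, stale_buffer, read_completes_first):
    """Toy model: the file is rewritten from a buffer that may still hold old bytes.

    stale_buffer stands for whatever an earlier operation left in the buffer
    and must be at least as long as file_old.
    """
    buffer = bytearray(stale_buffer)
    if read_completes_first:
        # The read ran before the write-back: the buffer holds the real file contents.
        buffer[:len(file_old)] = file_old
    # The write-back stores the buffered prefix followed by the newly appended bytes.
    return bytes(buffer[:len(file_old)]) + appended

print(write_back(b"HEAD", b"+new", b"JUNKJUNK", True))   # b'HEAD+new'
print(write_back(b"HEAD", b"+new", b"JUNKJUNK", False))  # b'JUNK+new'
```

In the second call the read has been postponed past the write-back, so the old part of the file is replaced by leftover bytes while the appended part is intact, which matches the symptom described.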

Does TCQ always make sure that several I/O operations on the same file are executed in the correct logical order rather than in the fastest executable order?

Any ideas?




Online Brahim

Re: small files in the pool get corrupted when data is appended
« Reply #19 on: August 04, 2015, 07:00:57 pm »
You do realize that your reports are now conflicting.
First, you reported that the issue only happened through the pool and that when writing directly to the tRAID disks, everything was fine.

However, now TCQ is affecting things?
TCQ is an aspect of the RAID and has nothing to do with the pool. If TCQ affects things, you should see the issue when writing to the tRAID disks directly.

You need to truly isolate where and when the issue happens for you by process of elimination. If it is the combo of TCQ and the pool, then that needs to be made clear too.

Offline uhthome

Re: small files in the pool get corrupted when data is appended
« Reply #20 on: August 08, 2015, 09:22:17 am »
Thanks for replying. I am sorry if my reports caused confusion. I have gone through several rounds of elimination, and things were exactly as reported. Not all tests were done when I contacted the forum the first time, so further test results came up during the process. To summarize:

1. The problem occurred when writing to the pool. 

2. The problem did not occur when writing to the disks directly (even with the pool online at the same time).

3. The problem was also eliminated by disabling TCQ while writing to the pool.

So I conclude, at my current state of knowledge, that the problem is associated with the combo of pool and TCQ.

Online Brahim

Re: small files in the pool get corrupted when data is appended
« Reply #21 on: August 09, 2015, 05:45:48 pm »
Good. So, as per your report, the combo of the pool and TCQ is what is causing the issue you are reporting.

A good thing would be to repeat the tests without the pool one more time. That would help focus the investigation.

Offline uhthome

Re: small files in the pool get corrupted when data is appended
« Reply #22 on: August 17, 2015, 03:30:50 pm »
I have done more tests. I can confirm: the issue occurs only when the write goes through the pool drive (V:\) and TCQ is enabled.
Changing the write path to a specific drive within the pool prevents the issue. That is, I changed only the drive letter in the write command; the pool was not stopped.

Today, I replaced the SATA controller driver that was supplied with Windows Home Server 2011 (Server 2008 R2) with the SATA AHCI driver supplied by the chipset manufacturer (AMD). This became necessary because the original driver did not accept 4 TB drives.

I am not sure if this will have an effect on the reported issue. Therefore, with the new driver, I am going to repeat the tests and I will let you know about the results when I am done.

Offline uhthome

Re: small files in the pool get corrupted when data is appended
« Reply #23 on: August 18, 2015, 03:49:13 pm »
The new AHCI driver did not help. The issue came right back.

To summarize:

The issue occurs when data is appended to a small file on a pool drive, using the pool drive letter, with TCQ turned on. The old part of the file gets replaced by garbage that happened to be in the buffer beforehand. That means the system fails to load the old part of the file into the buffer fast enough when the file is opened, and when the file is closed soon thereafter, the random buffer contents replace the file contents.

The issue is prevented by
- disabling TCQ
- using a real drive letter rather than the pool drive letter in the write path
- writing to a larger file (which does not fit completely into the buffer?)
- inserting a delay of 1 or 2 seconds between opening a file and the next file operations.

I would be happy to do more tests if told what should be tested. Now, I am going to turn off TCQ, because I need to protect the data in the pool.
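
For anyone who wants to repeat this kind of test, the following Python sketch shows one way to automate it. It is not the program used in the tests above; the function name, file names, header size, and round count are assumptions made for the example. Pointing directory at the pool drive letter or at an individual drive letter selects the write path under test, and delay mirrors the 1 to 2 second workaround.

```python
import os
import time

def find_corrupted_appends(directory, rounds=50, header_size=512, delay=0.0):
    """Create small files, append to each, and return the paths whose content changed."""
    corrupted = []
    for i in range(rounds):
        path = os.path.join(directory, f"append_test_{i:04d}.bin")
        header = bytes([i % 256]) * header_size   # distinct, recognizable old content
        tail = b"tail-%d\n" % i                   # the data appended afterwards
        with open(path, "wb") as f:
            f.write(header)
        with open(path, "ab") as f:
            if delay:
                time.sleep(delay)                 # optional wait between open and write
            f.write(tail)
        with open(path, "rb") as f:
            content = f.read()
        # The old part must be unchanged and followed by exactly the appended bytes.
        if content != header + tail:
            corrupted.append(path)
    return corrupted
```

An empty result means no corruption was detected in that run. Note that a read-back immediately after the write may be served from a cache, so re-checking the files later (or after a reboot) gives a stronger result.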

Online Brahim

Re: small files in the pool get corrupted when data is appended
« Reply #24 on: August 18, 2015, 03:52:42 pm »
Do you have file caching enabled on the pool? Or is it only metadata caching?

Offline uhthome

Re: small files in the pool get corrupted when data is appended
« Reply #25 on: August 19, 2015, 01:42:41 pm »
Only metadata caching. I have also tried no caching at all, but this had no influence on the issue.

Online Brahim

Re: small files in the pool get corrupted when data is appended
« Reply #26 on: August 19, 2015, 01:52:08 pm »
How about the effect of SWO? That is, in case you had SWO + TCQ, what about TCQ but no SWO?

There will be a test build released this coming weekend with a pool driver update. We'll see if that changes anything.

Offline uhthome

Re: small files in the pool get corrupted when data is appended
« Reply #27 on: August 26, 2015, 12:02:02 pm »
First, I tested TCQ=true, SWO=false. This prevented the issue, as did TCQ=false, SWO=true.

Then I set TCQ=true, SWO=true and installed the new tRAID version of Aug 22nd. After that, I found the following:

1. The drives in the pool are no longer accessible by their individual drive letters, but the pool drive works normally.

2. The issue is, unfortunately, NOT SOLVED. When data is appended to small files in the pool, their first part can get overwritten by data originating from other files that had been accessed earlier. It is the same as described earlier in this thread.
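
The results reported so far in this thread can be written down as a small table and checked against a candidate explanation. The Python sketch below does this; the tuple layout and the function predicted are assumptions made for the example, and the SWO setting for the direct-drive test is recorded as None because it was not stated.

```python
# ((write path, TCQ, SWO), corruption observed?) as reported in the posts above.
# SWO is None where the post did not state the setting.
observations = [
    (("pool", True, True), True),      # original problem; still present with the Aug 22nd build
    (("pool", True, False), False),
    (("pool", False, True), False),
    (("direct", True, None), False),   # individual drive letter instead of the pool letter
]

def predicted(path, tcq, swo):
    """Candidate explanation: corruption needs pool path, TCQ, and SWO together."""
    return path == "pool" and tcq is True and swo is True

# Any case where the candidate explanation disagrees with what was observed.
mismatches = [case for case, seen in observations if predicted(*case) != seen]
print("consistent with all reports:", not mismatches)  # consistent with all reports: True
```

Agreement with four observations does not prove the cause; it only shows that none of the reported tests contradicts this combination.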

Online Brahim

Re: small files in the pool get corrupted when data is appended
« Reply #28 on: August 26, 2015, 02:01:32 pm »
So, to recap: TCQ + SWO + Storage Pool is the combo that causes the issue for you. Correct?

Offline uhthome

Re: small files in the pool get corrupted when data is appended
« Reply #29 on: August 27, 2015, 02:45:44 pm »
Correct.