Author Topic: small files in the pool get corrupted when data is appended

Offline uhthome

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
    • View Profile
Hi,
I filed a bug report a few weeks ago and have since done more testing, with the following result: small files are corrupted when they are modified. On these occasions they are overwritten with data originating from other files. For example, the first part of a log file may be overwritten with a header from an OpenOffice file. This happens when the files are accessed through the pool drive letter. When they are accessed through their original drive letters, everything works fine. Running the Sync program to flush the caches did not prevent this.
I hope for a solution; otherwise I will have to stop using the pool drive, because data integrity is important. That is actually why I am using FlexRAID. Why is nobody responding to the bug report?
Regards

Offline Brahim

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8,547
  • Karma: +204/-16
    • View Profile
Re: small files in the pool get corrupted when data is appended
« Reply #1 on: May 23, 2015, 06:11:46 pm »
Data integrity is paramount. No one would use the product if they were affected by something like this.
You need to backtrack and look at your system and configuration. If using the pool with caching enabled, make sure that the tRAID disks are not being mounted.

Offline uhthome

« Reply #2 on: May 25, 2015, 03:54:00 pm »
Thanks, Brahim. That's why I am so alarmed. No disk was mounted, they all reside in the server and are not touched.

I found this because I am logging data with this computer from several sources using small programs written in Free Pascal which write records of 100...200 bytes into log files. When I had several occasions of corrupted log files, I wrote a test program writing mock data to a log file to test things out. The test program writes 222 bytes to its log file every time it is started. What I found:
1. When I start the program by hand, it works fine, no errors.
2. When the program is started by the Windows scheduler every day, like my real logging programs, I get a corrupted file every time the log file size increases from 444 to 666 bytes. Larger files seem to be protected.
3. The log file must reside in the pool and be accessed through the pool for the error to occur.
4. The data corruption consists of the first ~500 bytes of the log file being overwritten by a header from another file. Frequently, OpenOffice file headers were found. Correct data generally follows at the end of the log file.
5. When the test program writes to a network drive instead, there is no file corruption.
6. When the test program writes to a local drive not part of the pool, there is no file corruption.
7. When the test program writes to a drive which is part of the pool, but uses the real drive letter of the drive instead of the pool drive letter, there is no file corruption.
8. Flushing the caches with Sync or disabling the pool caching did not prevent the file corruption
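
For readers who want to try something similar, here is a minimal sketch of such a test in Python rather than the original Free Pascal (which is not shown in this thread). The file name, the record contents and the helper names `make_record`, `append_record` and `run_demo` are assumptions made for illustration; only the 222-byte record size comes from the description above.

```python
import os
import tempfile

RECORD_SIZE = 222  # bytes per run, as in the test described above


def make_record(run_number):
    """Build one fixed-size mock record (contents are invented for illustration)."""
    text = f"run {run_number:06d} ".encode("ascii")
    # Pad with dots and end with a newline so every record is exactly 222 bytes.
    return text.ljust(RECORD_SIZE - 1, b".") + b"\n"


def append_record(path, run_number):
    """Append one record to the log file and return the resulting file size."""
    with open(path, "ab") as log:
        log.write(make_record(run_number))
        log.flush()
        os.fsync(log.fileno())  # ask the OS to push the data to the device
    return os.path.getsize(path)


def run_demo(directory):
    """Simulate three separate runs; the second-to-third step is 444 -> 666 bytes."""
    path = os.path.join(directory, "logfile.txt")
    return [append_record(path, n) for n in range(1, 4)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_demo(tmp))
```

Pointing `run_demo` at a directory reached through the pool drive letter, and then at the same directory through its real drive letter, would mirror points 3 and 7. Note that in the tests above the corruption only showed up when the job was started by the scheduler, so a hand-started run may not reproduce it.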

This is my system: A HP Proliant Microserver N54L with AMD CPU, 8GB ECC RAM, Windows Home server on a Samsung 840 Pro SSD, tRAID, 4 NAS-Ready hard drives with 2 TB each, one is the parity drive, two data drives in the pool, one drive is not in the pool. There is no open office installed on the server, but on the network clients, and open office files are being stored on the server's pool drives by the clients.

If you have any further questions, I will answer them as well as I can. I want to get rid of this problem, which gives me a headache, since the integrity of my data seems to be at risk at the moment. I would like to continue with tRAID, which I like a lot.

Thanks a lot!!!

Offline Brahim

« Reply #3 on: May 26, 2015, 03:04:56 pm »
I note that you said the issue happens when using the pool. However, if you have not done so yet, try stopping the pool, mounting a tRAID disk, and accessing that disk differently to see if the issue persists even then.
The point is to determine whether the issue is at the pool level or at the RAID level.

Offline uhthome

« Reply #4 on: May 27, 2015, 03:20:57 pm »
When I set the logging program to access the file through the pool drive, such as v:\solar\logfile.txt, then the described error occurs when the logfile is appended and the file size increases from 444 to 666 bytes.  However, if I do the same append operation while accessing the drive directly through its real drive letter, such as e:\solar\logfile.txt, I have never seen an error, even without stopping the pool.

Offline Brahim

« Reply #5 on: May 28, 2015, 03:15:09 pm »
Interesting.
The only scenario you have to watch out for is having caching on the pool turned on while the tRAID disks are mounted.
Outside of that, nothing should cause corruption at the pool level.

Can't think of anything else at the moment.

Offline uhthome

« Reply #6 on: May 29, 2015, 10:50:26 am »
Well, disabling caching on the pool did not prevent the file corruption.

Offline uhthome

« Reply #7 on: June 07, 2015, 04:46:12 pm »
Hi Brahim,

I was wondering whether you have found any idea about this persisting problem.

I have reproduced the problem repeatedly and can give you a more detailed description; maybe this gives you a clue. When a small file (the limit seems to be less than 512 bytes) is appended to so that it grows past 512 bytes in length, the old part of that file, exactly up to the previous length, gets overwritten by a header from another file. It seems that the computer dumps whatever contents reside in some cache into the file. This happens only when the file is accessed through the pool drive letter, and it seems to be associated with early operations after wakeup from suspend, run through the Windows scheduler. Later file access operations started by hand do not seem to reproduce the error.

Maybe this helps. For the time being, would it be a good idea to use tRAID without using the pool, i.e. sharing the directories from the real drive letters rather than from the pool? I am a little afraid of losing more important data some day. Without the pool, life would be less convenient, though.
Regards




Offline Brahim

« Reply #8 on: June 07, 2015, 09:13:02 pm »
@uhthome
I have not been able to replicate the issue, but I might be getting the scenario wrong.
I did 512-byte appends to a file without issues.

Could you describe the setup in more detail once more?

1. What's the size of the starting file?
2. What's the write size? 512 bytes or a multiple of that?
3. Is it a sequential write?
4. Is the write offset 512-byte aligned?

Offline uhthome

« Reply #9 on: June 09, 2015, 02:59:28 pm »
Hi Brahim,

I am sorry for not making things clear, so I am happy to clarify now.

I have observed the problem under different circumstances, but always only with small files. I then created a test program that reproduces the error. The test program works as follows:
1. The size of the starting file is 444 bytes.
2. The append size is 222 bytes, giving a total length of 666 bytes after the write. During this write, the first 444 bytes are erroneously replaced by a header from another file. The last 222 bytes are written correctly. I have repeatedly seen OpenOffice headers being inserted into the first 444 bytes. While OpenOffice is not installed on the server, OpenOffice documents are being stored on the V: drive network shares.
3. I think it is a sequential write
4. No, it is not.

So the server replaces the first part of the file with random data that happens to be around, and appends the correct data. After the file has grown to 666 bytes, all further append operations are performed correctly and the error is not repeated. However, I see the error again when I restart the process with another 444-byte file.
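
To make this failure pattern easy to detect automatically, a small checker along the following lines could be run after each append. This is only a sketch in Python; the function name `classify_append` and the sample data are made up for illustration. The `PK\x03\x04` bytes stand in for an OpenOffice header, since OpenDocument files are ZIP archives and start with that signature.

```python
def classify_append(before, appended, after):
    """Compare a file's contents after an append with what they should be.

    before   -- file contents read before the append
    appended -- the bytes the program intended to append
    after    -- file contents read back after the append
    """
    if len(after) != len(before) + len(appended):
        return "length mismatch"
    if after[:len(before)] != before:
        return "old data overwritten"
    if after[len(before):] != appended:
        return "new data wrong"
    return "ok"


if __name__ == "__main__":
    before = b"A" * 444    # stands in for the two existing records
    appended = b"B" * 222  # stands in for the new record

    healthy = before + appended
    # Simulated damage: the old 444 bytes are replaced by foreign data,
    # while the newly appended 222 bytes are intact.
    damaged = b"PK\x03\x04" + b"\x00" * 440 + appended

    print(classify_append(before, appended, healthy))
    print(classify_append(before, appended, damaged))
```

The case described above, where the first 444 bytes are replaced and the last 222 bytes are correct, corresponds to the "old data overwritten" result.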

I hope I have made things clear. If not, please ask.

The setup is a HP Proliant Microserver N54L with AMD CPU, 8GB ECC RAM, Windows Home server on a Samsung 840 Pro SSD, tRAID, 4 NAS-Ready hard drives with 2 TB each, one is the parity drive, two are data drives in the pool, one drive is not in the pool.

Offline Brahim

« Reply #10 on: June 09, 2015, 03:49:12 pm »
In your test, what is the size of the write buffer?

I think your issues might be related to how you wrote your program.
When you allocate memory for a buffer, it contains garbage data. All operating systems require that write buffers be 512 bytes aligned. So, when writing only 222 bytes, the question now becomes in what way are you doing so?

Basically, you simply CANNOT write 222 bytes to a file. What you need to do is read 512 bytes from the file into a buffer, write the 222 bytes into the buffer, and then write the buffer to the file.
Your test program and its flaw have basically sent you down a different path from your original issue.

Post your test program if you wish.

Offline uhthome

« Reply #11 on: June 13, 2015, 03:19:23 pm »
Hi, thanks, that's interesting. I have been running logging programs like that for years and never had this problem, and even now the problem only occurs on the V: drive.

Where can I find the write buffer size? I cannot find it in the tRAID settings.

The programs use functions of the FPC Free Pascal compiler. I do not have to define and load buffers there. Shouldn't that be built into the exe code by the compiler?

The test program is attached.

Offline uhthome

« Reply #12 on: June 21, 2015, 05:50:22 am »
Hi Brahim,
The sequence you describe for writing 222 bytes should be done by the operating system and/or FlexRAID's drivers. There should be no need to program this in Pascal. I have been writing this kind of program for a decade and never had such a problem. Furthermore, it should not be overlooked that the data loss problem occurs only when writing to the pool drive. A write operation to another drive has never caused this problem, even if the drive was part of the pool.

This issue endangers the data integrity of small files. Is there any reason not to fix it? The reason I am using FlexRAID is data protection, and now the protector causes corruption of small files. We would all be better off if this were fixed. For my part, I cannot use the pool any more until then.

I have also submitted this problem to the developers of Free Pascal, the compiler that I have used recently. Here is what they wrote (see also http://mantis.freepascal.org/view.php?id=27670, last entries):

"You never need to read-modify-write sequence for files, such things are entirely covered by OS [core/drivers/cache]. Even in MS-DOS etc..

>> "Basically, you simply CANNOT write 222 bytes to a file. What you need to do is read 512 bytes from the file into a buffer, write the 222 bytes into the buffer, and then write the buffer to the file."

Surely You CAN. OS always does this sequence for you, if it ever needed.
This sentence only confirms a bug in a lower level (possibly, driver that handles "pooling").

Neither the compiler, nor RTL does care of what the file-alignment is for the data writing to a file -- it is entirely out of their scope.

[Edit: "All operating systems require that write buffers be 512 bytes aligned." I'm sad.. ;-) Indeed, it is an OS job to make sure its write buffers are properly aligned, but not all way around.]"

And:
"I agree with Max. If what he says is true, it is a requirement for interaction between OS and the flexraid block devicedriver, not applications and flexraid.
Applications are not even supposed to be aware of the blocksize"

Therefore, please check the flexraid block device driver. Thanks a lot!

Offline Brahim

« Reply #13 on: June 21, 2015, 07:45:00 am »
I am not familiar with Delphi. So, I can't draw anything from the program.

What I meant to say above is that you can't write 222 bytes to a disk (as opposed to a file). What I/O APIs do is use a buffer that is 512-byte aligned. All writes go to that buffer first, and then that buffer is written to disk in a 512-byte aligned fashion, either when the buffer fills up or on flush operations.
The combination of your write offset, write size, and write buffer has implications for what actually gets written to disk. If you do any seek operations, things get even more tricky as far as what becomes of the write offset.

All high-level programming I/O APIs provide a default buffer and let you specify a custom one. The write offset is something you can specify, or it becomes implicit and managed by the API as a simple advancing value (it starts at 0 and increments by the write size).
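
As an illustration of the buffering and implicit offset described above, here is a small sketch using Python's built-in `open` with an explicit 512-byte buffer. It says nothing about how Free Pascal or the pool driver behave; the function name `observe_buffering` and the file name are made up for the example, and the exact moment a buffer is flushed is an implementation detail of the runtime in use.

```python
import os
import tempfile

BUFFER_SIZE = 512     # custom buffer size passed to the I/O API
RECORD = b"x" * 222   # one small write, well below the buffer size


def observe_buffering(path):
    """Write one record through a 512-byte buffer and report what reached the file.

    Returns (file size before flush, API-managed write offset, file size after flush).
    """
    with open(path, "ab", buffering=BUFFER_SIZE) as log:
        log.write(RECORD)
        # The record fits in the buffer, so nothing has been handed to the OS yet.
        size_before_flush = os.path.getsize(path)
        # The API still tracks the logical write offset, buffered bytes included.
        position = log.tell()
        log.flush()
        size_after_flush = os.path.getsize(path)
    return size_before_flush, position, size_after_flush


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(observe_buffering(os.path.join(tmp, "buffer_demo.log")))
```

The application only ever asks for a 222-byte write; the grouping into larger transfers happens in the layers below it.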

If you don't understand these details, you are bound to make classic developer mistakes when dealing with I/O operations.

Re-write the program in C/C++ and things will become more apparent.

Offline uhthome

« Reply #14 on: June 21, 2015, 12:23:05 pm »
OK, I understand. Too bad that you are not familiar with Pascal/Delphi. Unfortunately, I am not familiar with C/C++ myself.
But from what you wrote and what I observed, we can draw the following conclusions:

When one appends data to a file while accessing it through the pool drive, we get a problem when the file fits into one 512 byte buffer before the append operation, and exceeds 512 bytes after the append operation.

The problem is then that exactly the part of the file before the write offset is replaced by something else. What is that something else? In these text files I frequently found contents of files that had been edited the day before, including contents of exe files that had gone through the buffers because I had modified them with a hex editor the day before, or headers of OpenOffice documents.

I suspect a possible mechanism: the file was not properly loaded into the buffer, so whatever happened to be lying around in the buffer prevails and is written back to the disk when the file is closed, replacing the original contents, followed by the new, appended data.
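
To make the suspected mechanism concrete, here is a toy model in Python. It is purely hypothetical: how the pool driver actually handles its buffers is not known here, and the function `append_via_cache`, the `load_first` switch and the sample data are all invented for illustration. The model only shows that skipping the "load existing contents" step would produce exactly the observed pattern.

```python
SECTOR = 512  # assumed buffer granularity in bytes


def append_via_cache(file_data, new_data, cache, load_first):
    """Toy model of an append that passes through a reused cache buffer.

    cache      -- whatever earlier, unrelated I/O left in the buffer
    load_first -- True: existing file contents are copied into the buffer
                  first (correct); False: that step is skipped (suspected bug)
    """
    old_len = len(file_data)
    total = old_len + len(new_data)
    sectors = -(-total // SECTOR)  # round up to whole sectors
    size = sectors * SECTOR
    # Start from the stale buffer contents, zero-padded to the needed size.
    buf = bytearray(cache[:size]).ljust(size, b"\x00")
    if load_first:
        buf[:old_len] = file_data
    buf[old_len:total] = new_data  # the new record is always placed correctly
    return bytes(buf[:total])      # what ends up in the file


if __name__ == "__main__":
    old = b"A" * 444                                 # existing log contents
    new = b"B" * 222                                 # appended record
    stale = bytearray(b"PK\x03\x04" + b"S" * 508)    # leftovers from another file

    good = append_via_cache(old, new, stale, load_first=True)
    bad = append_via_cache(old, new, stale, load_first=False)
    print(good == old + new)                         # correct path keeps old data
    print(bad[:4], bad[444:] == new)                 # buggy path: stale head, good tail
```

In this model the first 444 bytes come from the stale buffer while the last 222 bytes are correct, which matches what I see in the damaged log files.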

I have no idea how I can prevent this in Pascal, because all the buffer handling happens under the surface, in the OS and its drivers. As I said, it happens only when using the pool. Is it possible that some driver dealing with the v:\ pool fails to properly load file contents into the buffer if the file is smaller than 512 bytes?

I hope these details are of help!