AVRO-524

DataFileWriter.appendTo leads to intermittent IOException during write()

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.3
    • Component/s: java
    • Labels:
      None

      Description

      To append to a data file, we first open the file as a RandomAccessFile in read-write mode, read some information such as the sync marker, seek to the end of the file, and then use its FileDescriptor to create a FileOutputStream. Sharing a FileDescriptor this way can lead to problems if one of its containers is garbage-collected while the other is still in use. Please see: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6322678. The bug was fixed in Java 7 (b06). In our case, the RandomAccessFile sometimes gets garbage-collected, leading to write errors. If the Java unit tests are run multiple times, this occurs about 25% of the time on my Windows machine and about 50% of the time on my Ubuntu Linux box.
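
      The pattern described above can be sketched roughly as follows. This is an illustrative reconstruction from the description, not the actual Avro source; the class and method names (SharedDescriptorSketch, openForAppendRisky) are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

final class SharedDescriptorSketch {
    // Hypothetical helper mirroring the description; not the Avro implementation.
    static FileOutputStream openForAppendRisky(File file) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        // (the header and sync marker would be read from raf here)
        raf.seek(raf.length()); // move the shared descriptor to end of file
        // The stream wraps raf's descriptor. Once this method returns, nothing
        // references raf any more, so on affected JVMs its finalizer may close
        // the descriptor while the returned stream is still in use.
        return new FileOutputStream(raf.getFD());
    }
}
```

      On a JVM that includes the fix for the linked bug, this code is expected to behave correctly; the failure is intermittent on older JVMs because it depends on when the garbage collector runs.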

      1. AVRO-524.patch
        1 kB
        Thiruvalluvan M. G.

        Activity

        Thiruvalluvan M. G. added a comment -

        The simple solution is to open the FileOutputStream in append mode from the filename rather than the FileDescriptor. Since its FileDescriptor is not going to be used for writing anymore, the RandomAccessFile can now be opened in "r" mode instead of "rw".
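
        A minimal sketch of this approach, assuming a fixed-size header purely for illustration (the real Avro header is not fixed-size, and the names AppendByNameSketch and openForAppend are hypothetical, not taken from the patch):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

final class AppendByNameSketch {
    // Reads headerOut.length bytes from the start of the file, then returns a
    // stream that appends to the same file by name.
    static FileOutputStream openForAppend(File file, byte[] headerOut) throws IOException {
        // Read-only is sufficient: this descriptor is never used for writing.
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        try {
            raf.readFully(headerOut);
        } finally {
            raf.close();
        }
        // A second, independent descriptor opened in append mode; its lifetime
        // no longer depends on the RandomAccessFile.
        return new FileOutputStream(file, true);
    }
}
```

        Because the two objects no longer share a FileDescriptor, garbage collection of one cannot invalidate the other.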

        Thiruvalluvan M. G. added a comment -

        Committed revision 938347.

        Doug Cutting added a comment -

        Do we even still need RandomAccessFile and a FileHandle at all anymore? These were only needed before because it's the only way I could find to get a single file descriptor that's open for both read and write. If we're willing to close and re-open the file then the header can be read with a FileInputStream, then the appends can be made to a FileOutputStream.

        Note this is a minor, probably insignificant, change in semantics. If the file were to be renamed between the time its header reader is opened and the appender is opened then it would now fail, where before it would succeed. This seems very unlikely and not worth protecting against, but, for the record, it was fears of issues like this that led me to use RandomAccessFile, so that the file was only opened once, that read and write permission, file existence, etc. would all be only checked once. In general, re-opening files is risky, but, in this specific case, it probably isn't.

        Finally, should we perhaps close the reader in a finally clause?
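
        The alternative raised in this comment (read the header through a FileInputStream, close it in a finally clause, then append through a separate FileOutputStream) might look like the sketch below. The class and method names and the headerBytes parameter are hypothetical and only serve the illustration.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

final class ReopenSketch {
    // Reads the first headerBytes bytes and always closes the reader.
    static byte[] readHeader(File file, int headerBytes) throws IOException {
        DataInputStream in = new DataInputStream(new FileInputStream(file));
        try {
            byte[] header = new byte[headerBytes];
            in.readFully(header);
            return header;
        } finally {
            in.close(); // runs even if readFully throws
        }
    }

    // Re-opens the file by name in append mode.
    static OutputStream openAppender(File file) throws IOException {
        return new FileOutputStream(file, true);
    }
}
```

        One detail worth keeping in mind for the rename scenario: `new FileOutputStream(file, true)` creates the file when it does not exist, so a caller that wants the append to fail in that case would need its own existence check.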

        Doug Cutting added a comment -

        Also, for the record, another way to fix the bug would be to simply retain a pointer to the RandomAccessFile, no?
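
        A rough sketch of that idea, assuming a small wrapper that keeps the RandomAccessFile strongly reachable for as long as the stream is in use (RetainingAppender is a hypothetical name, not an Avro class):

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

final class RetainingAppender implements Closeable {
    // Held as a field so it cannot be collected while this appender is reachable.
    private final RandomAccessFile raf;
    private final FileOutputStream out;

    RetainingAppender(File file) throws IOException {
        raf = new RandomAccessFile(file, "rw");
        raf.seek(raf.length());                  // position at end of file
        out = new FileOutputStream(raf.getFD()); // shares raf's descriptor
    }

    void write(byte[] bytes) throws IOException {
        out.write(bytes);
    }

    @Override
    public void close() throws IOException {
        try {
            out.close();
        } finally {
            raf.close();
        }
    }
}
```

        This keeps the single-open semantics of the original design, at the cost of carrying an extra field whose only purpose is to keep the descriptor's owner alive.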


          People

          • Assignee:
            Thiruvalluvan M. G.
            Reporter:
            Thiruvalluvan M. G.
          • Votes:
            0
            Watchers:
            0