Hadoop Common
  1. Hadoop Common
  2. HADOOP-4292

append() does not work for LocalFileSystem

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.18.0, 0.19.0
    • Fix Version/s: 0.18.2
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      append is supported by LocalFileSystem but it does not update crc when a file is appended.

      When you enable checksum verification TestLocalFileSystem.testAppend fails. Since HADOOP-4277 is a blocker for 0.17 I am planning to disable this test in HADOOP-4277.

      1. LFSAppend.patch
        2 kB
        Hairong Kuang

        Issue Links

          Activity

          Hide
          Raghu Angadi added a comment -

          This affects 0.18 as well. For 18, the fix could be as simple as just throwing a 'not supported' exception.

          Show
          Raghu Angadi added a comment - This affects 0.18 as well. For 18, the fix could be as simple as just throwing a 'not supported' exception.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          There is another bug in LocalFileSystem:
          When creating a file with LocalFileSystem.create(Path f, FsPermission permission, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress), no checksum will be created.

          Show
          Tsz Wo Nicholas Sze added a comment - There is another bug in LocalFileSystem: When creating a file with LocalFileSystem.create(Path f, FsPermission permission, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress), no checksum will be created.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          Created HADOOP-4326 for the LocalFileSystem.create(...) bug.

          Show
          Tsz Wo Nicholas Sze added a comment - Created HADOOP-4326 for the LocalFileSystem.create(...) bug.
          Hide
          Hairong Kuang added a comment -

          The complexity of append exists when the last checksum chunk is partial. In this case, I plan to first read the partial chunk out, then truncate the data file to exclude the last chunk and truncate the crc file to exclude the last 4 bytes. Afterwards write the partial chunk again and data to be append.

          Since LocalFileSystem is a ChecksumFileSystem, I plan to implement the above algorithm in ChecksumFileSystem so it is general to any checksum file system. For any checksum file system to support append, its raw file system needs to implement a TruncatableFileSystem interface that supports a truncate method:

          FSDataOutputStream truncate(Path file, long len);

          This method sets the give file's length be the given len and returns a FSDataOutputStream, whose current position is equal to the given len.

          Show
          Hairong Kuang added a comment - The complexity of append exists when the last checksum chunk is partial. In this case, I plan to first read the partial chunk out, then truncate the data file to exclude the last chunk and truncate the crc file to exclude the last 4 bytes. Afterwards write the partial chunk again and data to be append. Since LocalFileSystem is a ChecksumFileSystem, I plan to implement the above algorithm in ChecksumFileSystem so it is general to any checksum file system. For any checksum file system to support append, its raw file system needs to implement a TruncatableFileSystem interface that supports a truncate method: FSDataOutputStream truncate(Path file, long len); This method sets the give file's length be the given len and returns a FSDataOutputStream, whose current position is equal to the given len.
          Hide
          Hairong Kuang added a comment -

          Specifically what's in my mind is to introduce the following interface:

          TruncatableFileSystem.java
          interface TruncatableFileSystem {
            // set the file length to be the given len and the current position to be the given len
            FSDataOutputStream truncate(Path file, long len);
          

          RawLocalFileSystem will implement TruncatableFileSystem.

          In ChecksumFileSystem.append, if the raw file system is not a TruncatableFileSystem, throw an exception indicating that the append is not supported. Otherwise, read the last partial trunk, truncate both the data file and crc file, rewrite the last partial trunk.

          Show
          Hairong Kuang added a comment - Specifically what's in my mind is to introduce the following interface: TruncatableFileSystem.java interface TruncatableFileSystem { // set the file length to be the given len and the current position to be the given len FSDataOutputStream truncate(Path file, long len); RawLocalFileSystem will implement TruncatableFileSystem. In ChecksumFileSystem.append, if the raw file system is not a TruncatableFileSystem, throw an exception indicating that the append is not supported. Otherwise, read the last partial trunk, truncate both the data file and crc file, rewrite the last partial trunk.
          Hide
          Raghu Angadi added a comment -

          truncate implies that if the first append() fails, user can actually lose data that existed in the file before append().

          Show
          Raghu Angadi added a comment - truncate implies that if the first append() fails, user can actually lose data that existed in the file before append().
          Hide
          Hairong Kuang added a comment -

          Raghu's comment makes sense. How about we truncating only the crc file not the data file? So data is not at the risk of losing.

          Show
          Hairong Kuang added a comment - Raghu's comment makes sense. How about we truncating only the crc file not the data file? So data is not at the risk of losing.
          Hide
          Hairong Kuang added a comment -

          Another option is that we could have a positional write interface instead of having a truncate interface. Something like:

          interface PositionalWritableFileSystem {
             // Return an output stream with the given pos as its current position; next write will write to the current position
             FSOutputStream append(File file, long pos);
          }
          
          Show
          Hairong Kuang added a comment - Another option is that we could have a positional write interface instead of having a truncate interface. Something like: interface PositionalWritableFileSystem { // Return an output stream with the given pos as its current position; next write will write to the current position FSOutputStream append(File file, long pos); }
          Hide
          dhruba borthakur added a comment -

          The "append" design (HADOOP-1700) allows for the fact that we coudl support multiple concurrent appenders in future. This means that handling the last partial chunk is best handled in the server and not in the client.

          Show
          dhruba borthakur added a comment - The "append" design ( HADOOP-1700 ) allows for the fact that we coudl support multiple concurrent appenders in future. This means that handling the last partial chunk is best handled in the server and not in the client.
          Hide
          Doug Cutting added a comment -

          As Raghu points out and you acknowledged on the list, only the CRC file needs to be truncated.

          An alternative would be to re-write the entire CRC file when truncate is not possible.

          Should we add truncate() to FileSystem.java instead of to a new interface?

          Why does the truncate method return an FSDataOutputStream? Perhaps it could return a boolean, indicating whether it was successful. Then the default implementation could simply return false. Or it could return void and the default would throw UnsupportedOperationException.

          Show
          Doug Cutting added a comment - As Raghu points out and you acknowledged on the list, only the CRC file needs to be truncated. An alternative would be to re-write the entire CRC file when truncate is not possible. Should we add truncate() to FileSystem.java instead of to a new interface? Why does the truncate method return an FSDataOutputStream? Perhaps it could return a boolean, indicating whether it was successful. Then the default implementation could simply return false. Or it could return void and the default would throw UnsupportedOperationException.
          Hide
          Raghu Angadi added a comment -

          > The "append" design (HADOOP-1700) allows for the fact that we coudl support multiple concurrent appenders in future. This means that handling the last partial chunk is best handled in the server and not in the client.

          ChecksumFileSystem is managed only on the client. It won't work for multiple appenders, just like it does not work for multiple writers now. Note that DFSOutputStream is not a ChecksumFileSystem. Not sure if this is related to HADOOP-1700 at all.

          Show
          Raghu Angadi added a comment - > The "append" design ( HADOOP-1700 ) allows for the fact that we coudl support multiple concurrent appenders in future. This means that handling the last partial chunk is best handled in the server and not in the client. ChecksumFileSystem is managed only on the client. It won't work for multiple appenders, just like it does not work for multiple writers now. Note that DFSOutputStream is not a ChecksumFileSystem. Not sure if this is related to HADOOP-1700 at all.
          Hide
          Raghu Angadi added a comment -

          > Note that DFSOutputStream is not a ChecksumFileSystem.
          I meant to say DFS is not a ChecksumFileSystem.

          Show
          Raghu Angadi added a comment - > Note that DFSOutputStream is not a ChecksumFileSystem. I meant to say DFS is not a ChecksumFileSystem.
          Hide
          Hairong Kuang added a comment -

          The reason that I did not propose adding truncate() to FileSystem.java is that I did not make dramatic change to FileSystem APIs. But if this is OK with everybody, it is fine with me too. If truncate returns an FSDataOutput, next write does not need to open it for appending first.

          Show
          Hairong Kuang added a comment - The reason that I did not propose adding truncate() to FileSystem.java is that I did not make dramatic change to FileSystem APIs. But if this is OK with everybody, it is fine with me too. If truncate returns an FSDataOutput, next write does not need to open it for appending first.
          Hide
          Doug Cutting added a comment -

          truncate does seem like a common-enough file operation to put in FileSystem.java.

          > If truncate returns an FSDataOutput, next write does not need to open it for appending first.

          My intuition is to make this two separate operations: truncate, and open-for-append. Do you have a strong reason to connect them?

          Show
          Doug Cutting added a comment - truncate does seem like a common-enough file operation to put in FileSystem.java. > If truncate returns an FSDataOutput, next write does not need to open it for appending first. My intuition is to make this two separate operations: truncate, and open-for-append. Do you have a strong reason to connect them?
          Hide
          Tsz Wo Nicholas Sze added a comment -

          > As Raghu points out and you acknowledged on the list, only the CRC file needs to be truncated.

          Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this?

          > The reason that I did not propose adding truncate() to FileSystem.java is that I did not make dramatic change to FileSystem APIs.

          There will be a lot of changes. Are we going to apply all these changes to 0.18 and onwards?

          Show
          Tsz Wo Nicholas Sze added a comment - > As Raghu points out and you acknowledged on the list, only the CRC file needs to be truncated. Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this? > The reason that I did not propose adding truncate() to FileSystem.java is that I did not make dramatic change to FileSystem APIs. There will be a lot of changes. Are we going to apply all these changes to 0.18 and onwards?
          Hide
          Raghu Angadi added a comment -

          > Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this?
          We could seek to the right position and start writing from there. If that is not allowed, we could copy the crc file.

          For this jira, we can implement this for LocalFS (and any simpler Filesystems), the default implementation at ChecksumFS could be to throw unsupported exception.

          Show
          Raghu Angadi added a comment - > Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this? We could seek to the right position and start writing from there. If that is not allowed, we could copy the crc file. For this jira, we can implement this for LocalFS (and any simpler Filesystems), the default implementation at ChecksumFS could be to throw unsupported exception.
          Hide
          Doug Cutting added a comment -

          > Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this?

          With ChecksumFileSystem, lots of failures can result in data-checksum inconsistency. I don't think this makes the situation particularly worse. Perhaps we could change the checksum file's format so that it includes a trailing element that includes the file's length or somesuch, to avoid such false-positives.

          > There will be a lot of changes [ ... ]

          More than the addition of a single method to FileSystem.java?

          Show
          Doug Cutting added a comment - > Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this? With ChecksumFileSystem, lots of failures can result in data-checksum inconsistency. I don't think this makes the situation particularly worse. Perhaps we could change the checksum file's format so that it includes a trailing element that includes the file's length or somesuch, to avoid such false-positives. > There will be a lot of changes [ ... ] More than the addition of a single method to FileSystem.java?
          Hide
          Hairong Kuang added a comment -

          > Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this?

          Append is an operation that changes the content of file. So it has a potential risk that may lead to data-crc inconsistency. This is a problem for append & write. I do not plan to get it solved in this issue.

          > My intuition is to make this two separate operations: truncate, and open-for-append.
          This is fine for me.

          Does the community feel comfortable for adding the method truncate to FileSystem?

          Show
          Hairong Kuang added a comment - > Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this? Append is an operation that changes the content of file. So it has a potential risk that may lead to data-crc inconsistency. This is a problem for append & write. I do not plan to get it solved in this issue. > My intuition is to make this two separate operations: truncate, and open-for-append. This is fine for me. Does the community feel comfortable for adding the method truncate to FileSystem?
          Hide
          dhruba borthakur added a comment -

          +1 on introducing FileSystem.truncate(Path, offset)

          Show
          dhruba borthakur added a comment - +1 on introducing FileSystem.truncate(Path, offset)
          Hide
          Owen O'Malley added a comment -

          -1 on introducing FileSystem.truncate in 0.19.

          I think that this patch should make ChecksumFileSystem.append throw UnsupportedOperationException.

          If someone wants to make it work in 0.20 that is great. But I don't think we should push it into 0.19.

          Show
          Owen O'Malley added a comment - -1 on introducing FileSystem.truncate in 0.19. I think that this patch should make ChecksumFileSystem.append throw UnsupportedOperationException. If someone wants to make it work in 0.20 that is great. But I don't think we should push it into 0.19.
          Hide
          Doug Cutting added a comment -

          Owen> -1 on introducing FileSystem.truncate in 0.19.

          I agree. I didn't notice earlier that this was targeted for 0.19. Append for the local fs is a new feature, and should go in 0.20. Truncate for any fs is clearly a new feature.

          Show
          Doug Cutting added a comment - Owen> -1 on introducing FileSystem.truncate in 0.19. I agree. I didn't notice earlier that this was targeted for 0.19. Append for the local fs is a new feature, and should go in 0.20. Truncate for any fs is clearly a new feature.
          Hide
          Hairong Kuang added a comment -

          This patch marks "append" as "not supported" in ChecksumFileSystem. As a result, append is not supported in LocalFileSystem.

          Show
          Hairong Kuang added a comment - This patch marks "append" as "not supported" in ChecksumFileSystem. As a result, append is not supported in LocalFileSystem.
          Hide
          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good.

          Show
          Tsz Wo Nicholas Sze added a comment - +1 patch looks good.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12392037/LFSAppend.patch
          against trunk revision 704261.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          -1 core tests. The patch failed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3453/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3453/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3453/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3453/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12392037/LFSAppend.patch against trunk revision 704261. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3453/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3453/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3453/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3453/console This message is automatically generated.
          Hide
          Hairong Kuang added a comment -

          I just committed this.

          Show
          Hairong Kuang added a comment - I just committed this.

            People

            • Assignee:
              Hairong Kuang
              Reporter:
              Raghu Angadi
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development