Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.2-alpha
    • Component/s: datanode, hdfs-client
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, the real expected semantics should be "flushes out to all replicas and all replicas have done posix fsync equivalent - ie the OS has flushed it to the disk device (but the disk may have it in its cache)." This jira aims to implement the expected behaviour.
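
      For orientation, here is a minimal sketch of the client-side contract this jira targets, using the org.apache.hadoop.fs.Syncable API (the path and the record variable are illustrative):

          Configuration conf = new Configuration();
          FileSystem fs = FileSystem.get(conf);
          FSDataOutputStream out = fs.create(new Path("/tmp/wal")); // illustrative path
          out.write(record); // 'record' is an illustrative byte[]
          // After hsync() returns, every replica should have fsync'ed the data to
          // its disk device (the disk's own cache may still hold it).
          out.hsync();
          out.close();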

      1. hdfs-744.txt
        16 kB
        Lars Hofhansl
      2. hdfs-744-v2.txt
        18 kB
        Lars Hofhansl
      3. hdfs-744-v3.txt
        28 kB
        Lars Hofhansl
      4. HDFS-744-trunk.patch
        18 kB
        Lars Hofhansl
      5. HDFS-744-trunk-v2.patch
        19 kB
        Lars Hofhansl
      6. HDFS-744-trunk-v3.patch
        20 kB
        Lars Hofhansl
      7. HDFS-744-trunk-v4.patch
        21 kB
        Lars Hofhansl
      8. HDFS-744-trunk-v5.patch
        25 kB
        Lars Hofhansl
      9. HDFS-744-trunk-v6.patch
        34 kB
        Lars Hofhansl
      10. HDFS-744-trunk-v7.patch
        34 kB
        Lars Hofhansl
      11. HDFS-744-trunk-v8.patch
        34 kB
        Lars Hofhansl
      12. HDFS-744-2.0-v1.patch
        26 kB
        Lars Hofhansl
      13. HDFS-744-2.0-v2.patch
        34 kB
        Lars Hofhansl


          Activity

          dhruba borthakur added a comment -

          do you plan this now or is this a placeholder for now? Is there a use-case for this requirement?

          dhruba borthakur added a comment -

          We need this functionality for HBase clusters. When HBase writes data to an HLog/HFile, it better be persisted on disk, otherwise there could be data loss when a non-trivial number of datanodes are power-cycled simultaneously.

          Liyin Tang added a comment -

          Shall we do this for avatar client as well?

          Hairong Kuang added a comment -

          I thought more about this. It seems to me it is not easy to support hsync. If the bytes written in between two hsync calls cross a block boundary, the first block needs to get fsynced to disk as well.

          An alternative proposal is to provide a filesystem create API that takes a fsync option. If fsync is true, any hflush call will force the data to be synced to disk and any block will be synced to disk upon close as well.
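
          A hypothetical shape for such a create overload (the trailing boolean and its name are assumptions, not a committed API):

              // Sketch only: a create(..) variant carrying an fsync option, mirroring
              // the parameter order of the existing create overloads.
              public FSDataOutputStream create(Path f, FsPermission permission,
                  boolean overwrite, int bufferSize, short replication, long blockSize,
                  Progressable progress, boolean fsync /* hypothetical */) throws IOException;
              // With fsync == true, every hflush() forces data to disk, and each block
              // is synced to disk on close.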

          Tsz Wo Nicholas Sze added a comment -

          Hi Dhruba, how is the datanode power-cycle done? Just killing the machines?

          dhruba borthakur added a comment -

          @Nicholas: yeah, machines were hard-rebooted (not a clean shutdown).

          @Hairong: yes, it makes sense to have a flag in the FileSystem.create() call to indicate if a hflush() should actually sync block data from the datanode to the disk.

          stack added a comment -

          @Hairong Flag on create sounds right.

          Liyin Tang added a comment -

          Since we are now using the new create API to force sync, there is no need to enable syncOnClose any more. A single packet will not cross a block boundary.

          Tsz Wo Nicholas Sze added a comment -

          Hi Hairong,

          We also don't have a FileSystem.create(..) API which lets a user create a file, or append to it when the file already exists. If you are changing FileSystem.create(..), you might want to add it as well. Alternatively, we might do it in a separate JIRA.

          Hairong Kuang added a comment -

          @Liyin, I am OK if, as a first pass, we simply sync every packet to disk when sync is set. This should work when sync calls are very frequent. If this turns out to be a performance problem for hbase, we could file a separate jira to optimize it by syncing to disk only when flush is called or at a block boundary.

          @Nicholas, what you suggested seems good to me. Could we do it in a separate jira?

          Tsz Wo Nicholas Sze added a comment -

          > ... Could we do it in a separate jira?
          Sure.

          Lars Hofhansl added a comment -

          As Dhruba said, we need this for HBase.

          One solution would be to (1) introduce a create mode that causes every block to be sync'ed upon close and (2) have a new DataXCeiver command to sync the current outstanding block.
          Together these ensure that we can guarantee sync'ed data and still avoid having to keep track of outstanding unsync'ed blocks, and also avoid needing to sync upon receipt of every packet.
          The client would need to pass the filemode along with every packet. (And maybe we could generalize this to a Unix-like file descriptor abstraction).

          • DFSClient.create gets an extra flags argument
          • The flags are passed along to the Datanode with every Packet via DFSClient.DFSOutputStream.Packet.writeData.
          • BlockReceiver.receivePacket reads these flags. receiveBlock can then act accordingly on close and issue force on the open filechannel.
          • A new DataTransferProtocol type OP_SYNC_BLOCK (or something), that'll sync an outstanding block. Maybe OP_SYNC_BLOCK should only be allowed if the file was opened in sync-on-block-close mode.

          The next problem is pipelined syncs. Eating up the sync cost N times in a pipeline is not ideal for latency.
          Here we could:

          • (optionally) only issue a sync to disk on the last (or first) replica in the pipeline. That way we take the sync hit only once and data is on physical disk on at least one of the replicas. (Could be made rack-aware so that we issue the sync on at least one machine per rack.) Or
          • more radically: since the order guaranteed by pipelining is only needed for appending to the block files, we could issue the syncs to all mirrors involved in parallel in a second request.

          Disclaimer: I work on HBase but am a HDFS noob.

          Lars Hofhansl added a comment -

          Turns out no new DataTransferProtocol option is needed, the client can send a sync packet to indicate that the current block needs to be sync'ed even before it is closed.

          I'm almost done with a patch that does what I have described above. The only part remaining is allowing the client to send an empty sync packet if no data is outstanding (currently a packet with payloadLen==0 indicates EOF).

          Lars Hofhansl added a comment -

          Here's a sketch of a patch against 1.0.x.
          I did only light testing on this.

          It uses the packet's lastPacketInBlock byte to send flags to the Datanode. (An old client will work against a new data node; a new client should work against an old data node as long as the sync feature is not used, but it would not fail gracefully if it did.)

          The client can send two flags with a packet:

          1. sync block
          2. sync packet

          The idea is that #1 would be sent with at least one packet per block in order to sync the block upon close.
          #2 is sent by the client if a sync should be forced immediately on a partial block. If the client has outstanding data to send anyway, the flag is piggybacked on the packet for that data; otherwise an empty sync packet is sent if needed.

          Together they allow a client to ensure that all bytes up to a certain point are on disk.
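
          A sketch of how the two flags might be packed into the packet's lastPacketInBlock byte (bit values are illustrative, not the patch's actual constants):

              static final byte LAST_PACKET_IN_BLOCK = 0x01; // pre-existing meaning
              static final byte SYNC_BLOCK           = 0x02; // flag #1: fsync the block on close
              static final byte SYNC_PACKET          = 0x04; // flag #2: fsync immediately

              // Datanode side: decode the header byte on receipt of a packet.
              boolean lastInBlock = (headerByte & LAST_PACKET_IN_BLOCK) != 0;
              boolean syncOnClose = (headerByte & SYNC_BLOCK) != 0;
              boolean syncNow     = (headerByte & SYNC_PACKET) != 0;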

          Please have a look and let me know whether I'm off track with this.

          If not, I'll clean it up, add some tests, create a trunk patch (which I imagine would look a bit different), and maybe add an only-the-last-replica-syncs option.

          Thanks.

          stack added a comment -

          Patch looks good to me.

          Were you going to expose a file mode instead of a boolean 'sync' flag? Sync might be a nebulous notion at this stage in hdfs's evolution, but I suppose if there is only one thing to set on the exposed mode, it's best to keep it basic for now (rather than sync, call it fsync or 'force', the name of the method you call on a Java FileChannel?)

          The sync flag in the checksum filesystem seems to line up w/ the create parent flag in the method call. That's probably not what you want?

          Turns out no new DataTransferProtocol option is needed, the client can send a sync packet to indicate that the current block needs to be sync'ed even before it is closed.

          Does this mean we don't need to set the sync flag for the whole filesystem such that on close we call filechannel force? We can do it on a file-by-file basis?

          Are you thinking we'd work on optimizations later, after this patch went in; e.g. sync only the first block in pipeline while we are replicating to the other pipeline members, etc.?

          Lars Hofhansl added a comment -

          Thanks Stack.

          The sync flag does two things: it causes blocks to be fsync'ed upon close at the DN, and it causes the DN to fsync right away when the client issues a sync().
          I can call it force (the closest posix equivalent would be O_SYNC). I can also add a FileMode (which eventually could include append, overwrite, and sync, but would only do sync for now).

          Does this mean we don't need to set the sync flag for the whole filesystem such that on close we call filechannel force? We can do it on a file-by-file basis?

          Yep.

          The sync flag in the checksum filesystem seems to line up w/ the create parent flag in the method call. Thats probably not what you want?

          Whoops. Yeah, checksum filesystem had a private create method that happened to match the signature. Will fix. Means the signatures in FileSystem need to be different.

          Are you thinking we'd work on optimizations later, after this patch went in; e.g. sync only the first block in pipeline while we are replicating to the other pipeline members, etc.?

          That'd be a bigger change. Need to sync in a separate thread, then flush to replicas, then wait for sync and flush to finish. What I had in mind was to sync synchronously at one of the replicas.

          Also noticed that the new create method in FileSystem calls itself, rather than the one that does not have the sync flag.

          Lars Hofhansl added a comment -

          Renamed "sync" to "force" added a few more comments. Fixed ChecksumFileSystem.create signature issue. Fixed endless loop when FileSystem.create with force is called on FS other than DFS.

          Lars Hofhansl added a comment -

          More changes needed for HBase (SequenceFiles need to support the force flag).

          I made matching changes in HBase and will test this in a real cluster.

          The patch has become big, and there is no feedback yet from any HDFS folks.

          Without a design approval, more work might go to waste.

          Todd Lipcon added a comment -

          Hi Lars. Sorry, I wasn't watching this before so I missed your work til you mentioned it on the HBase JIRA. I'll try to take a look at this this week. Feel free to grab me on IRC if you have specific questions.

          Todd Lipcon added a comment -

          One quick note, though: in order for this to be reviewed/committed, the patch needs to be on trunk. You'll find that on trunk we use protobufs for the data transfer protocol, so adding a new sync packet type/flag should be much simpler! I think we should work on trunk to iron out the APIs, and then figure out how to shoehorn it into the 1.x protocol if need be at a later date.

          Lars Hofhansl added a comment -

          Thanks Todd. I'll create a trunk patch tomorrow (hopefully).

          Lars Hofhansl added a comment -

          Here's a patch against trunk.
          Principle is the same, CreateFlag now also has a FORCE flag.

          It compiles and I did test with a simple client.

          I have not managed to get HBase running with Hadoop trunk, yet.

          Lars Hofhansl added a comment -

          Patch that I tested against HBase.
          (I posted the required HBase changes on the linked jira.)

          HBase starts up, and I can flush and compact tables.
          I verified via debugger that the sync path is correctly triggered.

          Please have a look. For users like us (Salesforce.com) this is an important data safety feature.

          Tsz Wo Nicholas Sze added a comment -

          Let's try submitting it. Will review the patch next week.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12526451/HDFS-744-trunk-v2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          -1 javac. The applied patch generated 1934 javac compiler warnings (more than the trunk's current 1933 warnings).

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.fs.TestFilterFileSystem
          org.apache.hadoop.fs.viewfs.TestViewFsTrash
          org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery
          org.apache.hadoop.hdfs.TestMultiThreadedHflush
          org.apache.hadoop.hdfs.TestRenameWhileOpen
          org.apache.hadoop.hdfs.TestWriteRead
          org.apache.hadoop.hdfs.TestFileAppend3
          org.apache.hadoop.hdfs.TestFileLengthOnClusterRestart
          org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS
          org.apache.hadoop.hdfs.TestPipelines

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2427//testReport/
          Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2427//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2427//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          Looking at the test failures.

          Lars Hofhansl added a comment -

          Most tests fail due to an NPE in flush: currentPacket can be null in DFSOutputStream.hflush/hsync even when bytesCurBlock > lastFlushOffset, which I found a bit surprising.
          I think in that case it is OK not to send a sync packet (if FORCE is enabled), because there is nothing outstanding to sync.

          Also fixed TestFilterFileSystem.
          Not sure what's wrong with testTrash.

          Also, I could use guidance on:

          • what types of tests I should add
          • if I wanted to add a feature where optionally only the first DN in the replication chain does the fsync, how should a client convey that to the FS.create method (it does not make sense to add another CreateFlag).
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12526608/HDFS-744-trunk-v3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          -1 javac. The applied patch generated 1934 javac compiler warnings (more than the trunk's current 1933 warnings).

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.fs.TestFilterFileSystem
          org.apache.hadoop.fs.viewfs.TestViewFsTrash

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2428//testReport/
          Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2428//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2428//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          Really fix TestFilterFileSystem this time.
          The TestTrash failure seems unrelated to my change.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12526642/HDFS-744-trunk-v4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 2 new or modified test files.

          -1 javac. The applied patch generated 1934 javac compiler warnings (more than the trunk's current 1933 warnings).

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2430//testReport/
          Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2430//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2430//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          In HBASE-5954 I added some performance tests on HBase with the WAL and HFile sync enabled/disabled respectively.
          The good news is that HBase runs fine with this HDFS patch and FORCE enabled.

          Tsz Wo Nicholas Sze added a comment -

          Thanks a lot, Lars! The patch looks good. Some comments:

          • Let's name the new CreateFlag SYNC_BLOCK instead of FORCE (see the usage sketch after this list). POSIX uses SYNC as you mentioned, but POSIX SYNC means syncing every write.
          • DFSOutputStream.hsync(),
            • It should call flush(true). It is better to sync the current block than not to sync at all.
            • Need to update the javadoc to say that it only syncs the current block.
          • Rename flush(force) to flushOrSync(isSync) in BlockReceiver and DFSOutputStream. Please also update the javadoc.
          • We do not use tabs in Hadoop. Indentation should use two spaces.
          • Please add some new tests. It is not easy to test whether sync actually works but at least add some new test to call hsync(). See TestFileAppend and TestFileAppend[234] to get some ideas.
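
          A usage sketch for the proposed SYNC_BLOCK flag, based on the trunk FileSystem.create overload that takes an EnumSet (buffer size, replication, and block size are illustrative):

              EnumSet<CreateFlag> flags = EnumSet.of(CreateFlag.CREATE, CreateFlag.SYNC_BLOCK);
              FSDataOutputStream out = fs.create(path, FsPermission.getDefault(),
                  flags, 4096 /* bufferSize */, (short) 3 /* replication */,
                  64L * 1024 * 1024 /* blockSize */, null /* progress */);
              // Every completed block is fsync'ed on the datanodes when it is closed.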
          Lars Hofhansl added a comment -

          Thanks Nicholas!

          I'll rename the flag and the flush method.

          The reason for not calling flush(true) was two-fold:

          1. Code that currently uses hsync would suddenly get the new behavior. For example HBase, which uses this via a SequenceFile.Writer, would have no option to disable this (unless we expose a new flag to Writer.syncFS).
          2. Without SYNC_BLOCK it kinda makes no sense (or would at least set the wrong expectation that everything is sync'ed to disk).

          Sorry about the tabs, I had used my default eclipse formatter.
          Will look at the Append tests and add some new ones for hsync.

          Lars Hofhansl added a comment -

          Alternatively we could add flushFS() to SequenceFile.Writer.

          Lars Hofhansl added a comment -

          New version of the patch.

          • Implemented all of Nicholas's suggestions.
          • Added some simple tests.
          • Added a flushFS() method to SequenceFile.Writer.

          I would still prefer to implement hsync() as flushOrSync(syncBlock) rather than flushOrSync(true), but this works too.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528002/HDFS-744-trunk-v5.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified test files.

          -1 javac. The applied patch generated 1973 javac compiler warnings (more than the trunk's current 1972 warnings).

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.fs.viewfs.TestViewFsTrash

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2470//testReport/
          Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2470//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2470//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          The new javac warning is

          [WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java:[260,30] [deprecation] createNonRecursive(org.apache.hadoop.fs.Path,org.apache.hadoop.fs.permission.FsPermission,java.util.EnumSet<org.apache.hadoop.fs.CreateFlag>,int,short,long,org.apache.hadoop.util.Progressable) in org.apache.hadoop.fs.FileSystem has been deprecated

          So, it is okay.

          This is a significant change. I think the patch is quite close. I will take a look one more time in detail. If you want to review the patch, please do so.

          Todd Lipcon added a comment -

          Quickly looked at the patch. A few notes:

          • I don't think it's a good idea to change SequenceFile.syncFs() to actually call through to hsync(), given how much more expensive it is. This would be a "performance-incompatible change". As much as it sucks, maybe we need to deprecate syncFs() and add a new method, instead – eg one that takes a boolean or an enum explaining what level of sync is needed.
          • For the javadoc on SYNC_BLOCK, should explain that, though it's similar to O_SYNC, it differs in that it's only on close that it is synced, and recommend that the user call hsync() explicitly after each write if true synchronous behavior is required
          • TestHSync is missing license header
          • Metrics: we should have some metrics on number of hsyncs performed at each DN, as well as the time spent in FileChannel.force. The functional tests could then verify these metrics are actually incremented after the hsync calls where expected
          • It seems wrong that the syncBlock flag is on the packets themselves. Why do we need this flag? Why not just have the client (or server) keep a flag which gets set whenever hsync() has been called. Then, when the client sends "last packet in block", the sync flag also gets set?
          • Can you explain this logic to me?
            +      if (syncPacket && !(syncBlock && lastPacketInBlock)) {
            +        flushOrSync(true);
            +      }
            

            Is that to avoid the cost of a "double" sync on close? If so, can you add a comment as much?

          • +  required bool syncBlock = 5;
            +  required bool syncPacket = 6;
            

            These flags should be optional with a default of false, so that we don't break client-server compatibility with 2.0.0-alpha.

          Lars Hofhansl added a comment -

          Thanks Todd and Nicholas.

          • To your first point, this is why I initially had hsync only actually fsync if the stream was created accordingly. So it's only a performance change if the stream was consciously created with SYNC_BLOCK. Nicholas had suggested that hsync should always sync. I'm also happy to add a syncFS(boolean sync) to SequenceFile.Writer.
          • Re: sending the flag along with the last packet. I was seeing this more as a set of flags that describe the stream to the DN (like a file descriptor). For this specific flag it would be ok to only send it with the last packet of a block; for other (future) flags that might not be the case. Do you feel strongly about this? Happy to change it, if you do.
          • The logic is indeed to avoid double sync. Specifically this is the case where an empty sync-packet was sent to trigger a sync on the DN. Will add a comment.
          • Will make these flags optional with default of false.

          I guess the main part to decide: Should DFSOutputStream.hsync() only sync if the stream was created with SYNC_BLOCK? If not, should we deprecate SequenceFile.Writer.syncFS() and add SequenceFile.Writer.syncFS(boolean)?

          Lars Hofhansl added a comment -

          Oh, and I'll add the metrics too.

          Todd Lipcon added a comment -

          Here's my opinion: let's create a syncFs(SyncType type) and an enum SyncType { HFLUSH, HSYNC }. Then, should we add a new type of sync (eg HSYNC_ON_ONLY_ONE_NODE) we can do it compatibly without an explosion of APIs.
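
          A minimal sketch of that proposal, assuming SequenceFile.Writer delegates to the underlying Syncable stream (this is not a committed signature):

              // Hypothetical addition to SequenceFile.Writer.
              public enum SyncType { HFLUSH, HSYNC }

              public void syncFs(SyncType type) throws IOException {
                switch (type) {
                  case HFLUSH: out.hflush(); break; // flush to datanode buffers only
                  case HSYNC:  out.hsync();  break; // also force replicas to disk
                }
              }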

          Re: sending the flag along with the last packet. I was seeing this more as a set of flags that describe the stream to the DN (like a file descriptor). For this specific flag it would be ok to only send it with the last packet of a block; for other (future) flags that might not be the case. Do you feel strongly about this? Happy to change it, if you do

          If the flags are with the packet, then I think they should explicitly describe what to do with that packet – or cause a specific action at the time the packet is received. The strange thing about SYNC_BLOCK is that it's not a command, but rather an indicator of some action to do in the future. So I think it makes sense to either attach it to the empty end-of-block (indicating the action to take at that point), or make it implicit that if you ever hsync() at any point in a block, then it will also hsync() the close of the block. Thoughts?

          Tsz Wo Nicholas Sze added a comment -

          Hi Todd, you make a good point that we should not change SequenceFile.syncFs(). Let's

          • deprecate SequenceFile.Writer.syncFs();
          • add SequenceFile.Writer.hsync() and SequenceFile.Writer.hflush();
          • change SequenceFile.Writer to implement org.apache.hadoop.fs.Syncable.

          I am fine with combining syncBlock and syncPacket if it is possible.
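
          A sketch of what those Writer changes could look like (the delegation targets are assumptions; the deprecated method keeps its old flush-level behavior):

              // Hypothetical sketch of the proposed SequenceFile.Writer surface.
              public static class Writer implements java.io.Closeable, Syncable {
                private FSDataOutputStream out;

                /** @deprecated Use {@link #hflush()} or {@link #hsync()} instead. */
                @Deprecated
                public void syncFs() throws IOException { out.hflush(); } // old flush-level behavior

                @Override
                public void hflush() throws IOException { out.hflush(); }

                @Override
                public void hsync() throws IOException { out.hsync(); }
              }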

          Lars Hofhansl added a comment -

          I like SequenceFile.Writer implementing Syncable.

          I see the argument for just a single flag. Yes, that would be simpler: just have a sync flag, and the client decides when to send it (either upon calling hsync, or with the last packet in the block if the stream was created with SYNC_BLOCK).
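
          On the write path that decision could look roughly like this (field and method names are illustrative, not the patch's actual ones):

              // Hypothetical client-side logic for setting the single sync flag.
              boolean syncFlag =
                  isHsyncCall                                     // user called hsync() explicitly
                  || (createdWithSyncBlock && lastPacketInBlock); // SYNC_BLOCK stream at block end
              packet.setSyncFlag(syncFlag);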

          Lars Hofhansl added a comment -

          New patch.

          • only uses one flag
          • SequenceFile.Writer implements Syncable.
          • syncFs is deprecated.
          • added metrics
          • added test using metrics to verify behavior
          • added replication test, to make sure the hsync is happening correctly on replicas.
          • found a bug where sometimes a sync is missed; it happens when the currentPacket in hsync is null. Fixed now.

          Currently I count syncs of "out" and "checksumOut" (in BlockReceiver) as two sync events. It would be hard to do otherwise, as either can be null, in which case no sync happens for that stream.

          In rare cases I found that the metrics on the DNs are not updated fast enough, and a test fails. Looks like there are existing tests that also use metrics that have the same problem. Not sure what to do about that.
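
          For reference, a metrics-based check in a test might look like this (MetricsAsserts is Hadoop's test helper; the "FsyncCount" counter name is an assumption about what the patch registers):

              import static org.apache.hadoop.test.MetricsAsserts.assertCounter;
              import static org.apache.hadoop.test.MetricsAsserts.getMetrics;

              DataNode dn = cluster.getDataNodes().get(0);
              // Two events per sync: one for "out" and one for "checksumOut" (see above).
              assertCounter("FsyncCount", 2L, getMetrics(dn.getMetrics().name()));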

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528215/HDFS-744-trunk-v6.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          -1 javac. The applied patch generated 1973 javac compiler warnings (more than the trunk's current 1972 warnings).

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.fs.viewfs.TestViewFsTrash

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2489//testReport/
          Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2489//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2489//console

          This message is automatically generated.

          Lars Hofhansl added a comment -

          Upon further inspection it turns out the issue with the metrics not being updated in time is due to the fact that flushOrSync is not on the synchronous hsync or hflush path in BlockReceiver.receivePacket.
          When hsync returns the DNs have all gotten the data and the request to flush and sync, but they are not necessarily finished with it.

          The same, BTW, is currently (without my patch) true of hflush. When hflush returns the DNs have received the data in their buffers, but they may not be done flushing the data to the OS buffers.

          If we wanted to fix this (which would involve doing the flushOrSync in the PacketResponder if the request is a client request, which inconveniently would also serialize the flush/syncs across replicas), this should be a different jira, I think.

          For now, to make the tests a bit more predictable I'll increment the sync counter before the actual sync operation rather than afterwards.

          Lars Hofhansl added a comment -

          v7 is almost identical to v6.

          • Count a sync only once, but still measure combined time for syncing out and checksumOut
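
          Roughly, that accounting could look like the following sketch (hypothetical names; FileOutputStream.getFD().sync() is the posix fsync equivalent, and the counter is bumped before the sync, as described above, to keep the tests predictable):

              import java.io.FileOutputStream;
              import java.io.IOException;
              import java.util.concurrent.atomic.AtomicLong;

              // Sketch only: one sync event per request, counted up front, with the
              // sync time measured across both streams combined.
              class SyncAccountingSketch {
                final AtomicLong fsyncCount = new AtomicLong();
                final AtomicLong fsyncNanos = new AtomicLong();

                void flushOrSync(FileOutputStream out, FileOutputStream checksumOut,
                                 boolean isSync) throws IOException {
                  if (isSync) {
                    fsyncCount.incrementAndGet();   // count once, before syncing
                  }
                  long begin = System.nanoTime();
                  if (checksumOut != null) {
                    checksumOut.flush();
                    if (isSync) checksumOut.getFD().sync();
                  }
                  if (out != null) {
                    out.flush();
                    if (isSync) out.getFD().sync();
                  }
                  if (isSync) {
                    fsyncNanos.addAndGet(System.nanoTime() - begin);  // combined time
                  }
                }
              }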
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528275/HDFS-744-trunk-v7.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          -1 javac. The applied patch generated 1973 javac compiler warnings (more than the trunk's current 1972 warnings).

          -1 javadoc. The javadoc tool appears to have generated 2 warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.fs.viewfs.TestViewFsTrash

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2490//testReport/
          Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2490//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2490//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Looks pretty good! Just a few small things:

          • looks like hard tabs still snuck in one spot
          • I don't think hsync() needs to be synchronized: i.e., further writes could proceed to the stream while an earlier writer is waiting on a sync response

              // repeated hsyncs, not avoiding unnecessary sync here
              out.hsync();
              checkSyncMetric(cluster, 2);
          

          maybe add a note that says more explicitly something like: "this is a potential future optimization that is not yet implemented"?

          Lars Hofhansl added a comment -
          • The synchronized was left over from the former hsync stuff (the one that just called hflush). I agree it should not be needed.
          • Removed the tabs
          • Added a better comment about multiple hsyncs in the test code
          • Removed some unused variables from the test code
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12528804/HDFS-744-trunk-v8.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified test files.

          -1 javac. The applied patch generated 1972 javac compiler warnings (more than the trunk's current 1971 warnings).

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2511//testReport/
          Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2511//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2511//console

          This message is automatically generated.

          Tsz Wo Nicholas Sze added a comment -

          +1 patch looks good.

          Todd, do you want to take a look at the latest patch? I will commit it if you don't have further comments.

          Lars Hofhansl added a comment -

          @Todd: Could you have a quick look? There are possible improvements to this (sync on one DN only, sync on one DN per rack, sync in a tree-like fashion for many replicas, etc.), but these would be best tackled in a separate jira.

          This overlaps (partially) with HDFS-1783, so I'd like to hold off on that until this is committed.

          Eli Collins added a comment -

          @Lars, Todd is on vacation this week; it might take him some time to get back to you.

          Tsz Wo Nicholas Sze added a comment -

          If Todd is not around, then I will commit the patch so that it won't block Lars' work.

          Tsz Wo Nicholas Sze added a comment -

          I have committed this. Thanks, Lars!

          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #2302 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2302/)
          HDFS-744. Support hsync in HDFS. Contributed by Lars Hofhans (Revision 1344419)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344419
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CreateFlag.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFilterFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PacketHeader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AppendTestUtil.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestHSync.java
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2375 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2375/)
          HDFS-744. Support hsync in HDFS. Contributed by Lars Hofhans (Revision 1344419)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344419
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CreateFlag.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFilterFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PacketHeader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AppendTestUtil.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestHSync.java
          Lars Hofhansl added a comment -

          Thanks Nicholas!

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #2320 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2320/)
          HDFS-744. Support hsync in HDFS. Contributed by Lars Hofhans (Revision 1344419)

          Result = FAILURE
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344419
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CreateFlag.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFilterFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PacketHeader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AppendTestUtil.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestHSync.java
          Ted Yu added a comment -

          Backport to hadoop 1.0 would be nice.

          Lars Hofhansl added a comment -

          I agree. Or at least Hadoop 2.0.x.

          Tsz Wo Nicholas Sze added a comment -

          Backporting it to 1.0 requires a completely different patch.

          For merging it to 2.0.x, the 2.0.0-alpha release is already out and we are not supposed to add new features to a minor release. We should first send an email to the hdfs-dev list to check with other people. I bet no one would object. Lars, do you want to send the email, or should I do it?

          Eli Collins added a comment -

          We're still not out of alpha for 2.0; IMO it seems reasonable to merge this for the 2.0.1-alpha release.

          Lars Hofhansl added a comment -

          The principal change for 1.0 is the same, but the FileSystem API is significantly different.
          I'll update the patch I had already attached here.
          I'll also do a backport to 2.0. Hopefully there aren't many differences.

          Lars Hofhansl added a comment -

          This patch mostly applied cleanly to the 2.0 branch (some offsets, no fuzz; one hunk failed but was easily fixable).

          I understand if there's hesitation to pull this into 2.0 or even 1.0.

          Lars Hofhansl added a comment -

          Meh, forgot to add TestHSync to svn.

          Lars Hofhansl added a comment -

          I'm happy to send an email to the DEV list, BTW.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1062 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1062/)
          HDFS-744. Support hsync in HDFS. Contributed by Lars Hofhans (Revision 1344419)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344419
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CreateFlag.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFilterFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PacketHeader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AppendTestUtil.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestHSync.java
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1096 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1096/)
          HDFS-744. Support hsync in HDFS. Contributed by Lars Hofhans (Revision 1344419)

          Result = SUCCESS
          szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1344419
          Files :

          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CreateFlag.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFilterFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PacketHeader.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetrics.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AppendTestUtil.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestHSync.java
          Lars Hofhansl added a comment -

          Two responses on the DEV list about the port to 2.0, both in favor.
          If there are no objections, should I open a new jira, or should we just do it here?
          The patch is identical (minus a single hunk that needed a small change to apply).

          Let's punt on 1.0; the API is too different.

          Tsz Wo Nicholas Sze added a comment -

          Hi Lars, I have just merged this to branch-2. "svn merge" did the job so we don't need a 2.0 patch. Thanks a lot!

          Lars Hofhansl added a comment -

          Just noticed (because of HDFS-3721) that this is now in 2.1.0-alpha.
          Any chance to get this into 2.0.x-alpha? (In fact, on June 4th it was marked with 2.0.1-alpha, but something changed since then, so now it's 2.1.0-alpha.)

          Todd Lipcon added a comment -

          2.0.x is supposed to be for critical bug fixes and security issues, not new features. So I don't think this should go in 2.0.2.

          Luke Lu added a comment -

          I think that it's important enough to clarify:

          When hsync returns the DNs have all gotten the data and the request to flush and sync, but they are not necessarily finished with it.

          It's NOT the "proper" fsync semantics then. In fact, this is even worse than hflush in branch-1, which acks after the data is flushed to the OS cache (written to the FileOutputStream). I.e., HBase with Hadoop 1.x can survive a kill -9 of all DNs at the same time without data loss, while this hsync implementation cannot.

          The same, BTW, is currently (without my patch) true of hflush. When hflush returns the DNs have received the data in their buffers, but they may not be done flushing the data to the OS buffers.

          This is an unfortunate side-effect of HDFS-265 that makes branch-1 hflush more durable (though still not durable enough) than hsync/hflush in trunk (and all post-0.21 code).

          If we wanted to fix this (which would involve doing the flushOrSync in the PacketResponder if the request is a client request, which inconveniently would also serialize the flush/syncs across replicas), this should be a different jira, I think.

          Did you file a JIRA, Lars? BTW, instead of doing the sync in the responder (ack thread), an alternative implementation could use a condition variable to signal sync completion from the data thread, which, I think, is a small price to pay for a parallel sync with the correct hsync semantics.
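
          A hedged sketch of that alternative (all names hypothetical): the data thread performs the fsync and signals completion, and the responder thread acks upstream only after both the downstream ack has arrived and the local sync has finished.

              import java.util.concurrent.locks.Condition;
              import java.util.concurrent.locks.ReentrantLock;

              // Sketch only: a condition variable lets the responder wait for the
              // data thread to finish syncing, without serializing syncs across
              // the replicas in the pipeline.
              class SyncSignal {
                private final ReentrantLock lock = new ReentrantLock();
                private final Condition synced = lock.newCondition();
                private long lastSyncedSeqno = -1;

                // Called by the data (receiver) thread after its fsync completes.
                void markSynced(long seqno) {
                  lock.lock();
                  try {
                    lastSyncedSeqno = seqno;
                    synced.signalAll();
                  } finally {
                    lock.unlock();
                  }
                }

                // Called by the responder thread before acking seqno upstream.
                void awaitSynced(long seqno) throws InterruptedException {
                  lock.lock();
                  try {
                    while (lastSyncedSeqno < seqno) {
                      synced.await();
                    }
                  } finally {
                    lock.unlock();
                  }
                }
              }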

          Lars Hofhansl added a comment -

          You are right, Luke. I implemented this in the context of hadoop-2 (i.e. with HDFS-265).
          It seems that to get this right, HDFS-265 needs to be revisited.

          Will look at your suggestion (doing the sync in the data thread). As long as the syncs (or flushes) are not serialized, it's fine (otherwise nobody is going to switch this on).

          Lars Hofhansl added a comment -

          In any case, hsync and hflush should be fixed together. A one-off for hsync does not seem to be the right thing.

          Kan Zhang added a comment -

          As long as the syncs (or flushes) are not serialized, it's fine (otherwise nobody is going to switch this on).

          I must be missing something here, but why can't we move the enqueuing of seqno in receivePacket() back to after flushOrSync() is done, essentially reverting the change made by HDFS-265? In doing so, hflush will be implementing API4 instead of API3.
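
          For illustration, a rough sketch of that ordering (hypothetical names, not the actual BlockReceiver code): the seqno is handed to the responder only after the local flush/sync completes, so the upstream ack implies the sync is done.

              import java.io.IOException;
              import java.util.concurrent.BlockingQueue;
              import java.util.concurrent.LinkedBlockingQueue;

              // Sketch only: enqueue for the responder after flushOrSync(), which
              // effectively reverts the ordering introduced by HDFS-265.
              class ReceiverSketch {
                final BlockingQueue<Long> ackQueue = new LinkedBlockingQueue<>();

                void receivePacket(long seqno, byte[] data, boolean sync) throws IOException {
                  // 1. mirror the packet to the downstream datanode (elided)
                  // 2. write the data locally
                  writeToDisk(data);
                  // 3. flush, and fsync if the packet requested it
                  flushOrSync(sync);
                  // 4. only now hand the seqno to the responder thread
                  ackQueue.add(seqno);
                }

                void writeToDisk(byte[] data) throws IOException { /* elided */ }
                void flushOrSync(boolean sync) throws IOException { /* elided */ }
              }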

          Lars Hofhansl added a comment -

          I think the problem is that we have to enqueue the seqno before the packet is sent downstream, right? (Otherwise we could potentially miss the ack.)

          So in order to enqueue the seqno after we syncOrFlush, we'd also have to send the packet downstream after we syncOrFlush, which essentially means that we are serializing the sync times across all replicas.

          Lars Hofhansl added a comment -

          Another approach would be to wait in the responder until both the downstream datanode has responded and the sync has finished. That way we get correctness and we can still interleave syncing/RTT in the pipeline.

          Kan Zhang added a comment -

          I think the problem is that we have to enqueue the seqno before the packet is sent downstream, right? (Otherwise we could potentially miss the ack.)

          I don't think we have to enqueue the seqno before the packet is sent downstream to avoid missing the ack, since the PacketResponder thread will wait on the ackQueue if it's empty, right?

              while (running && datanode.shouldRun && ackQueue.size() == 0) {
                if (LOG.isDebugEnabled()) {
                  LOG.debug(myString + ": seqno=" + seqno +
                      " waiting for local datanode to finish write.");
                }
                wait();
              }
          Lars Hofhansl added a comment -

          I see. In that case we wouldn't ack back until all local work is done.
          A possible place to do that would be almost at the end of receivePacket(), maybe in a finally block of the last try/catch in that method.

          That still does not take care of all the cases, though; for the last packet in a block, the sync is deferred to close() (to avoid a double sync). That's not hard to change either, I think.

          Lars Hofhansl added a comment -

          BTW. I filed HDFS-3979 to do that.


            People

            • Assignee: Lars Hofhansl
            • Reporter: Hairong Kuang
            • Votes: 0
            • Watchers: 36
