Hadoop Common / HADOOP-1702

Reduce buffer copies when data is written to DFS

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Reduced buffer copies as data is written to HDFS. The order of sending data bytes and control information has changed, but this will not be observed by client applications.

      Description

      HADOOP-1649 adds extra buffering to improve write performance. The following diagram shows the buffers, pointed to by (numbers). Each extra buffer adds an extra copy, since most of our read()/write()s match io.bytes.per.checksum, which is much smaller than the buffer size.

             (1)                 (2)          (3)                 (5)
         +---||----[ CLIENT ]---||----<>-----||---[ DATANODE ]---||--<>-> to Mirror  
         | (buffer)                  (socket)           |  (4)
         |                                              +--||--+
       =====                                                    |
       =====                                                  =====
       (disk)                                                 =====
      

      Currently, the loops that read and write block data handle one checksum chunk at a time. By reading multiple chunks at a time, we can remove buffers (1), (2), (3), and (5).

      Similarly some copies can be reduced when clients read data from the DFS.
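
      A minimal sketch of the idea (hypothetical class and stream names, not Hadoop's actual code): the savings come from moving a whole packet's worth of checksum chunks per read()/write() call instead of one io.bytes.per.checksum-sized chunk at a time, so the intermediate Buffered*Stream copies become unnecessary.

        import java.io.IOException;
        import java.io.InputStream;
        import java.io.OutputStream;

        class ChunkedCopy {
          static final int BYTES_PER_CHECKSUM = 512;  // typical io.bytes.per.checksum
          static final int PACKET_SIZE = 64 * 1024;   // many chunks per transfer

          // Before: one small read/write per checksum chunk; each call lands in a
          // Buffered{Input,Output}Stream, which adds a copy per chunk.
          static void copyPerChunk(InputStream in, OutputStream out) throws IOException {
            byte[] chunk = new byte[BYTES_PER_CHECKSUM];
            int n;
            while ((n = in.read(chunk)) > 0) {
              // per-chunk checksum verification omitted
              out.write(chunk, 0, n);
            }
          }

          // After: read and write a packet's worth of chunks at once, so the
          // intermediate buffers (1), (2), (3), and (5) above can be removed.
          static void copyPerPacket(InputStream in, OutputStream out) throws IOException {
            byte[] packet = new byte[PACKET_SIZE];
            int n;
            while ((n = in.read(packet)) > 0) {
              // per-packet checksum verification omitted
              out.write(packet, 0, n);
            }
          }
        }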

      Attachments

      1. HADOOP-1702.patch
        34 kB
        Raghu Angadi
      2. HADOOP-1702.patch
        39 kB
        Raghu Angadi
      3. HADOOP-1702.patch
        39 kB
        Raghu Angadi
      4. HADOOP-1702.patch
        39 kB
        Raghu Angadi
      5. HADOOP-1702.patch
        39 kB
        Raghu Angadi
      6. HADOOP-1702.patch
        41 kB
        Raghu Angadi
      7. HADOOP-1702.patch
        41 kB
        Raghu Angadi
      8. HADOOP-1702.patch
        42 kB
        Raghu Angadi
      9. HADOOP-1702.patch
        42 kB
        Raghu Angadi

        Issue Links

          • This issue relates to HADOOP-2154
          • This issue depends upon HADOOP-2758

          Activity

          Raghu Angadi created issue -
          Doug Cutting made changes -
          Field Original Value New Value
          Fix Version/s 0.15.0 [ 12312565 ]
          Konstantin Shvachko made changes -
          Link This issue relates to HADOOP-2154 [ HADOOP-2154 ]
          Raghu Angadi made changes -
          Link This issue depends upon HADOOP-2758 [ HADOOP-2758 ]
          Raghu Angadi made changes -
          Fix Version/s 0.17.0 [ 12312913 ]
          Description
          Raghu Angadi added a comment -

          The protocol and code have changed quite a bit around the write path, but the memory copies remain essentially the same as in the picture above.
          Raghu Angadi added a comment - edited

          The attached patch reduces buffer copies on the DataNode. The DataNode reads one
          packet and writes it to the mirror and to the local disk. Each packet contains
          io.file.buffer.size bytes of data.

          Below, 'b.i.s' is BufferedInputStream and 'b.o.s' is BufferedOutputStream; these
          are much larger than the typical size of reads or writes to them.

          Each ---> represents a memory copy.

          • Datanode :
            • before :
                                                                   + ---> b.o.s ---> mirror socket
                                                                   |
                client socket ---> b.i.s ---> small DataNode buf --|
                                                                   |
                                                                   + ---> b.o.s ---> local disk 
                
            • after :
                                                                 + ---> mirror socket
                                                                 |
                client socket ---> Large Datanode buf (Packet) --|
                                                                 |
                                                                 + ---> local disk 
                
          • Client :
            • before: the client used 64k packets irrespective of io.file.buffer.size,
              so the extra copy for b.o.s was not present if io.file.buffer.size was at
              or below 64k, but each packet required two writes.
            • after: the packet size is based on io.file.buffer.size and we use a single
              write to the datanode socket.

          I don't have numbers regarding CPU savings. In absolute terms, for a given amount of data on one DataNode, the CPU saved on the DataNode should be larger than the CPU saved when the same amount is read with HADOOP-2758.

          DFSIO benchmark numbers have been very sensitive to buffering (not to CPU) while writing, so we need to show that this patch does not negatively affect this benchmark.
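
          To make the packet idea concrete, here is a minimal sketch (illustrative only: the class, field names, header layout, and checksum encoding are assumptions, not the actual DFSClient/DataNode Packet code) of a buffer that accumulates many checksum chunks and is then flushed with one large socket write:

            import java.io.DataOutputStream;
            import java.io.IOException;
            import java.util.zip.CRC32;

            class PacketSketch {
              private final byte[] buf;           // checksums first, then data
              private final int dataStart;        // offset where data bytes begin
              private final int bytesPerChecksum;
              private int dataLen = 0;
              private int numChunks = 0;

              PacketSketch(int packetSize, int bytesPerChecksum) {
                this.bytesPerChecksum = bytesPerChecksum;
                int maxChunks = packetSize / (bytesPerChecksum + 4); // 4-byte CRC32 per chunk
                this.dataStart = maxChunks * 4;
                this.buf = new byte[dataStart + maxChunks * bytesPerChecksum];
              }

              boolean isFull() {
                return dataStart + dataLen + bytesPerChecksum > buf.length;
              }

              // Append one checksum chunk's worth of data plus its CRC (caller checks isFull()).
              void addChunk(byte[] b, int off, int len) {
                CRC32 crc = new CRC32();
                crc.update(b, off, len);
                int c = (int) crc.getValue();
                int csumOff = numChunks * 4;
                buf[csumOff]     = (byte) (c >>> 24);
                buf[csumOff + 1] = (byte) (c >>> 16);
                buf[csumOff + 2] = (byte) (c >>> 8);
                buf[csumOff + 3] = (byte) c;
                System.arraycopy(b, off, buf, dataStart + dataLen, len);
                dataLen += len;
                numChunks++;
              }

              // Send the whole packet with one large write instead of one write per chunk.
              void writeTo(DataOutputStream out) throws IOException {
                int csumLen = numChunks * 4;
                if (csumLen < dataStart) {
                  // Partial packet: slide the checksums down so they sit just before the data.
                  System.arraycopy(buf, 0, buf, dataStart - csumLen, csumLen);
                }
                out.writeInt(dataLen);                                  // simplified header
                out.write(buf, dataStart - csumLen, csumLen + dataLen); // single large write
                out.flush();
              }
            }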

          Raghu Angadi made changes -
          Attachment HADOOP-1702.patch [ 12376288 ]
          Raghu Angadi added a comment -

          The attached patch applies after applying the patch from HADOOP-2758.

          Raghu Angadi added a comment - edited

          Test results show a 30% improvement in DataNode CPU with the patch. I think it makes sense: based on the picture above, before this patch, with a replication of 3, the data is copied 6 + 6 + 4 times, and with this patch it is 3 + 3 + 2. Each of these datanodes verifies the CRC. Approximating the cost of checksumming as twice that of a memory copy, we get (8+6)/(14+6) == 70%. If we increase the size of the checksum chunk, the cost of CRC will go down; it would be 68% with a factor of 1.5 for CRC.

          Test setup: three instances of 'dd if=/dev/zero 4Gb | hadoop -put - 4Gb'. More importantly, the DataNode was modified to write the data to '/dev/null' instead of the block file; otherwise I could not isolate the test from disk activity. The cluster has 3 datanodes. The clients, Namenode, and datanodes are all running on the same node. The test was CPU bound.

          CPU measurement: Linux reports a process's CPU in /proc/pid/stat: the 14th entry is user CPU and the 15th is kernel CPU. I think these are specified in jiffies. Like most things in the Linux kernel, these are approximations, but reasonably dependable in large numbers. The numbers reported are the sum of the CPU for the three datanodes.
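
          For reference, a small sketch of that measurement (assuming the standard /proc stat layout; a hypothetical helper, not part of the patch). It pulls utime (field 14) and stime (field 15), in clock ticks, out of /proc/<pid>/stat:

            import java.io.IOException;
            import java.nio.file.Files;
            import java.nio.file.Paths;

            class ProcCpu {
              static long[] userAndKernelTicks(int pid) throws IOException {
                String stat = new String(Files.readAllBytes(Paths.get("/proc/" + pid + "/stat")));
                // Field 2 (the command name) may contain spaces, so split after the closing ')'.
                String rest = stat.substring(stat.lastIndexOf(')') + 2);
                String[] f = rest.split("\\s+");
                // 'rest' starts at field 3, so utime (field 14) is f[11] and stime (field 15) is f[12].
                return new long[] { Long.parseLong(f[11]), Long.parseLong(f[12]) };
              }

              public static void main(String[] args) throws IOException {
                long[] t = userAndKernelTicks(Integer.parseInt(args[0]));
                System.out.println("utime=" + t[0] + " stime=" + t[1]);
              }
            }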

          Below, 'u' and 'k' are user and kernel CPU in thousands of jiffies.

          Test            | Run 1            | Run 2            | Run 3            | Avg Total CPU | Avg Time
          Trunk*          | 8.60u 2.52k 372s | 8.36u 2.48k 368s | 8.39u 2.40k 368s | 10.95         | 369s
          Trunk + patch*  | 5.61u 2.22k 289s | 5.38u 2.16k 296s | 5.57u 2.25k 289s | 7.73 (70%)    | 291s (79%)

          * : datanodes write data to /dev/null.

          Currently, the DFSIO benchmark shows a dip in write bandwidth. I am still looking into it.

          Robert Chansler made changes -
          Fix Version/s 0.17.0 [ 12312913 ]
          Raghu Angadi added a comment -

          The dip in the DFSIO benchmark turned out to be because DFSIO creates files with a buffersize of 1000000! The buffersize passed while creating a file is passed on to the FileSystem implementation (DFSClient in this case). This raises the question of how an implementation should treat a user-specified buffersize. Can increasing the buffersize (as in this case) reduce performance, i.e. should an implementation allow it?

          This is what happens on trunk:

          • The user-specified buffersize effectively does not matter on trunk.
          • The client buffers up packets of 64k and flushes each packet when it is full. There can be at most 10 such packets in the pipeline at a time, usually much less.
          • DataNodes use io.file.buffer.size for their streams.

          With the patch here :

          • The user-specified buffersize sets the packet size.
          • At the DataNodes, the packet size dictates the write size for the mirror stream and the local file (i.e. io.file.buffer.size does not matter).
          • The rest is the same.

          Another proposal :

          • packetSize = Min( 64k, buffersize );
          • Max # packets in pipeline = Max(buffersize/packetSize, 10)

          The '64k' here could be made configurable (maybe "dfs.write.packet.size") so that different 'real' buffer sizes could be used for experimentation.

          How does the above proposal sound?
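
          A short sketch of how that sizing could look (the class, method names, and the config key are illustrative only, not committed code):

            class PacketSizing {
              // Could be read from a conf key such as "dfs.write.packet.size".
              static final int DEFAULT_PACKET_SIZE = 64 * 1024;

              static int packetSize(int userBufferSize) {
                return Math.min(DEFAULT_PACKET_SIZE, userBufferSize);
              }

              static int maxPacketsInPipeline(int userBufferSize) {
                return Math.max(userBufferSize / packetSize(userBufferSize), 10);
              }

              public static void main(String[] args) {
                int bufferSize = 1000000;  // e.g. the buffer size DFSIO passes
                System.out.println("packetSize = " + packetSize(bufferSize));           // 65536
                System.out.println("maxPackets = " + maxPacketsInPipeline(bufferSize)); // 15
              }
            }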

          dhruba borthakur added a comment -

          I would vote for keeping the packetSize fixed at 64K and not making it dependent on any user-defined configuration parameter.

          Raghu Angadi added a comment -

          What's wrong with an internal variable with a default of 64k? How do we know 64k is the best for all platforms, though it's a pretty good value?

          dhruba borthakur added a comment -

          We could certainly fetch the value from the conf; my point was not to insert this configuration parameter into the hadoop-defaults.xml file. Do you agree?

          Raghu Angadi added a comment -

          Yeah. This won't be in hadoop-defaults.xml. It will be an internal variable.

          Raghu Angadi made changes -
          Fix Version/s 0.18.0 [ 12312972 ]
          Raghu Angadi added a comment -

          Patch updated for trunk.

          Raghu Angadi made changes -
          Attachment HADOOP-1702.patch [ 12380339 ]
          Raghu Angadi made changes -
          Attachment HADOOP-1702.patch [ 12380397 ]
          Raghu Angadi made changes -
          Attachment HADOOP-1702.patch [ 12380401 ]
          Raghu Angadi added a comment -

          Patch updated for trunk.

          Raghu Angadi made changes -
          Attachment HADOOP-1702.patch [ 12380885 ]
          Hairong Kuang added a comment -

          A few initial comments:
          1. Once Packet.getBuffer() is called, no more data can be written to the packet. This is not obvious to code readers; better to add this restriction to the comment (see the sketch after this list).
          2. packetSize/writePacketSize in DFSClient don't include the size of the packet header. I think it is better to rename them to packetPayloadSize/writePacketPayloadSize.
          3. The packet size guess calculation in DataNode should match the calculation in DFSClient.
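
          Illustrative only (not the actual DFSClient.Packet class): one way to make comment #1 explicit is to document and enforce the restriction in the class itself.

            class PacketBufferGuard {
              private final byte[] buf = new byte[64 * 1024];
              private int dataLen = 0;
              private boolean sealed = false;

              /** Appends data. Must not be called after getBuffer(). */
              void write(byte[] b, int off, int len) {
                if (sealed) {
                  throw new IllegalStateException("packet already handed out via getBuffer()");
                }
                System.arraycopy(b, off, buf, dataLen, len);
                dataLen += len;
              }

              /** Returns the internal buffer for sending; afterwards the packet is read-only. */
              byte[] getBuffer() {
                sealed = true;
                return buf;
              }
            }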

          Raghu Angadi added a comment -

          Thanks Hairong. The attached patch fixes the above.
          Regarding #2, I changed packetSize to include the header rather than changing its name.

          Raghu Angadi made changes -
          Attachment HADOOP-1702.patch [ 12381544 ]
          Hairong Kuang added a comment -

          A few more comments:
          1. Currently DFSClient moves the checksums down to fill the gap when a partial packet needs to be sent. To avoid the copy, we could instead store the checksums in reverse order starting from dataStart.
          2. The Datanode does not verify checksums before a packet is sent to the downstream datanode in the pipeline. This is a change from the current trunk's behavior.
          3. In Datanode.BlockReceiver.readnextPacket, it is clearer to rename the variable "pktLen" to payloadLen.
          4. Datanode.BlockReceiver.checksumOut does not need a big buffer. SMALL_BUFFER_SIZE should do.

          Tsz Wo Nicholas Sze added a comment -

          > 1. Currently DFSClient moves the checksum down to fill the gap when a partial packet needs to be sent. To avoid the copy, we could instead store the checksums in the reversed order starting from the dataStart.

          Or, is it good to interleave data and checksums?

          Raghu Angadi added a comment -

          Thanks for the review Hairong.

          • #1: interesting suggestion. It would be a protocol change that affects other datanode transfers, like reading. We rarely send partial packets (mainly for fsync), and the checksum data is less than one percent of the packet, so I hope it is ok for this patch. Note that this is not a memory copy that did not exist before (previously all the data was copied).
          • #2: Yes, it is a change from the previous behavior. Before this patch it didn't matter, since we handled 512 bytes at a time. The receiving datanode verifies the checksum anyway, and checking the checksum after forwarding data downstream (theoretically) reduces latency. This is the same reason the datanode first sends the data to the mirror and then stores it locally.
          • #3. sure.
          • #4. yes.
          Raghu Angadi added a comment -

          > Or, is it good to interleave data and checksums?
          The main purpose of this and some further patches is to avoid interleaving data and checksums.

          Raghu Angadi added a comment -

          Updated patch has changes related to Hairong's review.

          Raghu Angadi made changes -
          Attachment HADOOP-1702.patch [ 12381708 ]
          Raghu Angadi made changes -
          Hadoop Flags [Reviewed]
          Status Open [ 1 ] Patch Available [ 10002 ]
          Raghu Angadi added a comment -

          DATA_TRANSFER_VERSION is incremented.

          Raghu Angadi made changes -
          Attachment HADOOP-1702.patch [ 12381710 ]
          Raghu Angadi made changes -
          Hadoop Flags [Reviewed] [Incompatible change, Reviewed]
          Release Note Reduce buffer copies when data is written to DFS. DataNode takes 30% less CPU. As a result, the format of data DFSClient sends changed and is incompatible with previous clients.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12381710/HADOOP-1702.patch
          against trunk revision 654315.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2432/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2432/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2432/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2432/console

          This message is automatically generated.

          Hairong Kuang added a comment -

          +1. The patch looks good.

          Raghu Angadi made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Raghu Angadi added a comment -

          Fixed findbugs warnings and ran 'ant patch'.

          Raghu Angadi made changes -
          Attachment HADOOP-1702.patch [ 12381909 ]
          Raghu Angadi made changes -
          Hadoop Flags [Reviewed, Incompatible change] [Incompatible change, Reviewed]
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12381909/HADOOP-1702.patch
          against trunk revision 655593.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2452/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2452/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2452/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2452/console

          This message is automatically generated.

          Raghu Angadi added a comment -

          I just committed this.

          Raghu Angadi made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed, Incompatible change] [Incompatible change, Reviewed]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          Integrated in Hadoop-trunk #491 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/491/ )
          Robert Chansler made changes -
          Release Note Reduce buffer copies when data is written to DFS. DataNode takes 30% less CPU. As a result, the format of data DFSClient sends changed and is incompatible with previous clients. Reduced buffer copies as data is written to HDFS. The order of sending data bytes and control information has changed, but this will not be observed by client applications.
          Hadoop Flags [Reviewed, Incompatible change] [Incompatible change, Reviewed]
          Nigel Daley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s dfs [ 12310710 ]
          Transition                    | Time In Source Status | Execution Times | Last Executer | Last Execution Date
          Patch Available -> Open       | 1d 2h 5m              | 1               | Raghu Angadi  | 09/May/08 22:32
          Open -> Patch Available       | 275d 21h 56m          | 2               | Raghu Angadi  | 12/May/08 22:45
          Patch Available -> Resolved   | 1d 8h 49m             | 1               | Raghu Angadi  | 14/May/08 07:35
          Resolved -> Closed            | 100d 13h 15m          | 1               | Nigel Daley   | 22/Aug/08 20:50

            People

            • Assignee: Raghu Angadi
            • Reporter: Raghu Angadi
            • Votes: 0
            • Watchers: 1
