Hadoop Common
  HADOOP-2758

Reduce memory copies when data is read from DFS

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.17.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      DataNode takes 50% less CPU while serving data to clients.

      Description

      Currently the datanode and the client side of DFS perform multiple copies of data on the 'read path' (i.e., the path from storage on the datanode to the user buffer on the client). This jira reduces these copies by enhancing the data read protocol and the implementation of read on both the datanode and the client. I will describe the changes in the next comment.

      The requirement is that this fix should reduce the CPU used and should not cause a regression in any benchmark. It might not improve the benchmarks, since most benchmarks are not CPU bound.

      1. HADOOP-2758.patch
        33 kB
        Raghu Angadi
      2. HADOOP-2758.patch
        33 kB
        Raghu Angadi
      3. HADOOP-2758.patch
        33 kB
        Raghu Angadi
      4. HADOOP-2758.patch
        33 kB
        Raghu Angadi
      5. HADOOP-2758.patch
        28 kB
        Raghu Angadi
      6. HADOOP-2758.patch
        21 kB
        Raghu Angadi

          Activity

          Hudson added a comment -

          Integrated in Hadoop-trunk #426 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/426/ )

          Raghu Angadi added a comment -

          I committed this yesterday.

          Hudson added a comment -

          Integrated in Hadoop-trunk #419 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/419/ )

          Konstantin Shvachko added a comment -

          +1 test results look encouraging.

          Raghu Angadi added a comment - edited

          Edit : typos

          More comparisons. I hope this shows the improvements.

          Test: run 6 instances of 'cat 5GbFile > /dev/null' on a single-node cluster. All the blocks are located on local disks (RAID0, I think).

          The hdfs tests include constant costs: client CPU, and kernel CPU not spent on behalf of user processes. Client CPU is at least as much as the DataNode's. So a 25% improvement in wall-clock time implies more than a 50% improvement in DataNode CPU: the client's share (at least half the total) is unchanged, so the entire saving comes out of the DataNode's share, which is at most half.

          Test           Bound by   Run1   Run2   Run3   Avg   Percentage   Note
          Trunk          CPU        355    332    346    344   100%
          Trunk + patch  CPU        225    213    228    222   65%
          cat command    Disk IO    185    83     105    124   36%          Not really comparable

          Even 21 instances of 'cat allBlocksForFile > /dev/null' were not CPU bound; 'cat' takes virtually zero CPU in user space.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12376788/HADOOP-2758.patch
          against trunk revision 619744.

          @author +1. The patch does not contain any @author tags.

          tests included +1. The patch appears to include 6 new or modified tests.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new javac compiler warnings.

          release audit +1. The applied patch does not generate any new release audit warnings.

          findbugs +1. The patch does not introduce any new Findbugs warnings.

          core tests +1. The patch passed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1873/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1873/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1873/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1873/console

          This message is automatically generated.

          Raghu Angadi added a comment -

          Fixed the findbugs warning.

          Many thanks to Mukund for running benchmarks with the patch. The results look fine, and we are going to double-check with another run. We want to make sure 'effective buffering' hasn't changed.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12376557/HADOOP-2758.patch
          against trunk revision 619744.

          @author +1. The patch does not contain any @author tags.

          tests included +1. The patch appears to include 6 new or modified tests.

          javadoc +1. The javadoc tool did not generate any warning messages.

          javac +1. The applied patch does not generate any new javac compiler warnings.

          release audit +1. The applied patch does not generate any new release audit warnings.

          findbugs -1. The patch appears to introduce 1 new Findbugs warnings.

          core tests +1. The patch passed core unit tests.

          contrib tests +1. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1849/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1849/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1849/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/1849/console

          This message is automatically generated.

          Raghu Angadi added a comment -

          Minor variable changes.

          Raghu Angadi added a comment -

          Thanks, Konstantin. The updated patch includes all the changes.

          Regarding sendBuf and maxChunksPerPacket: these are made local to sendBlock(). I think that's what you meant (instead of sendChunks(), which is invoked in sendBlock() in a loop).

          I will make this patch available once the DFSIO benchmark shows that the buffering is the same as before.

          Konstantin Shvachko added a comment -
          1. The comments are very useful, thanks.
          2. "4 byte Legth of actual data" should read Le_n_gth.
          3. BlockSender
            • sendBuf member should be a local variable of sendChunks().
              We always call the constructor BlockSender() followed by only one call to sendChunks().
              So there is no advantage of allocating the buffer in the constructor and then using it in sendChunks().
            • Same with maxChunksPerPacket
          4. +1 other than that
          Raghu Angadi added a comment -

          More changes under src/test, caused by renaming DATA_TRANFER_VERSION to DATA_TRANSFER_VERSION.

          Raghu Angadi added a comment -

          Konstantin, the updated patch adds all the suggested changes except #3 above (reducing the number of member variables). I think that is a large enough change to deserve its own Jira.

          The essence of the patch is at the top of the second comment (the difference between "before" and "after"). Please feel free to ping me for an explanation of any of the changes.

          Konstantin Shvachko added a comment -

          I understand now that DATA_TRANSFER_VERSION is part of the Data Transfer Protocol.
          > sendChunks(int) ... sends multiple CRC chunks.
          I would rather think of it as sending portions of data along with their corresponding CRCs, rather than CRCs with the data attached.

          Runping Qi added a comment -

          I expect you will see a big difference if you run 4 concurrent cats, each on a file on a different disk, vs. 4 concurrent dfs cats.

          Raghu Angadi added a comment -

          Dhruba,

          As Konstantin suggested, it is an internal protocol, and it is probably best to keep the protocol documentation in (large) code comments. It's harder to go back through HADOOP-1707 and HADOOP-1134 to extract the relevant information.

          dhruba borthakur added a comment -

          The most recent version of the Data Transfer Protocol can be found as an attachment to HADOOP-1707. It might be worthwhile to update this document with the changes proposed in this JIRA.

          Raghu Angadi added a comment -

          Regarding a couple of concerns in Konstantin's review:

          • > 2. Do we still need the notion of a chunk? [...]
            • I think so. A CRC chunk is still central to many things that the DataNode and DFSClient do. It is very useful, in discussions, descriptions, and even in code, to have a single word that consistently describes this essential unit of DFS data. If we see a method called 'sendChunk()', it's clear what it sends. For example, this patch renamed sendChunk() to sendChunks(int) because it now sends multiple CRC chunks.
          • > 5. DATA_TRANSFER_VERSION : I generally do not understand what is the meaning of this constant, [...]
            • Data transfers do not use RPCs. As noted in the comment, the version unfortunately does depend on Datanode serializations; it probably should not. This is analogous to RPC versions and protocol versions, which sit at two different levels of the stack.
          Raghu Angadi added a comment - edited

          Comparison of a single instance of 'dfs -cat 5Gbfile > /dev/null' with 'cat 5Gbfile > /dev/null'. All the data resides locally on a 4-disk RAID0 partition:

          (min:sec)   cat    dfs -cat with 0.16   dfs -cat with the patch
          run 1       2:40   3:44                 3:24
          run 2       2:56   3:05                 3:51
          run 3       3:01   3:18                 2:51

          What would you conclude? Both of the obvious conclusions are incorrect:

          1. dfs -cat is almost as good as simple cat.
          2. this patch does not help much.

          If we had a single disk partition, the numbers would be even closer.

          Raghu Angadi added a comment -

          A note regarding the 'dfs -cat' numbers: these are end-to-end tests, and the numbers vary depending on how many instances we run. As in any end-to-end test, there are multiple factors that are not affected by this patch. This patch reduces the CPU consumed by the DataNode while serving data; that cannot be read directly from the 'dfs -cat' numbers.

          I have semi-directly calculated that the DataNode with the patch takes 35-45% of the CPU it used to take before. This calculation uses the 9/10 ratio from HADOOP-2144. 'top' on my dev box truncates the summed-up CPU to 99.9 (unlike on the machine used in HADOOP-2144); otherwise we could directly compare the CPU taken by the DataNode instead of calculating it indirectly.

          Sameer asked me to compare a single instance of 'dfs -cat' with the regular shell 'cat'. I will add those numbers in the next comment.

          Raghu Angadi added a comment - edited

          Thanks for the detailed review, Konstantin.

          Just a clarification for readers: except for #4, these issues were not introduced by this patch. I will try to fix some of them.

          When this protocol was first introduced, it was described in the jira (HADOOP-1134). Since then it has evolved. I agree that it would be better to have the ascii diagrams in the code comments themselves; I will surely add that. A rough sketch of such a comment is below.
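          For illustration, such a code comment might look roughly like the sketch below. Apart from the '4 byte Length of actual data' field noted in the review, the field widths here are assumptions, not the committed wire format:

            /*
             * A read-path packet sent by the datanode, roughly:
             *
             *   4 byte   packet length
             *   8 byte   offset of this packet in the block
             *   8 byte   sequence number
             *   1 byte   'last packet in block' flag
             *   4 byte   length of actual data
             *   ....     checksums for all chunks in this packet
             *   ....     data for all chunks in this packet
             */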

          Konstantin Shvachko added a comment -
          1. Could you please add comments about the format of the send/receive protocol for blocks.
            We do not have a formal definition of the protocol, and it is very hard to understand the code
            without knowing the general structure of the packet and the bytes and bits it is composed of.
          2. Do we still need the notion of a chunk? Previously it corresponded to 512 data bytes prefixed
            by a 4 byte checksum. Now it is just a sequence of checksums followed by data.
            I think any mentioning of chunks should be either removed or replaced by something else in order
            to avoid confusion with the previous version.
          3. Too many (20 or so) members in BlockSender and BlockReceiver.
            • Some of them can be made local variables
              checksumSize
            • some are used merely to pass parameters between or into internal methods
              BlockSender.throttler
              BlockSender.out
              BlockSender.corruptChecksumOk
              BlockSender.chunkOffsetOK
            • and some are derivatives (computable) from other variables.
            • Please check the BlockReceiver as well.
            • Most of them are not introduced by this patch, but it'd be great if you could work on
              that to make the code more understandable. Right now it is in an incomprehensible state.
          4. Constants should be named in all capital letters.
            static final int packetHeaderLen =
            
          5. DATA_TRANFER_VERSION
            • Spelling TRANSFER
            • Please leave javadoc-style comments in place.
            • Please remove previous version descriptions.
            • I generally do not understand what is the meaning of this constant, introduced btw in HADOOP-1134.
              If DatanodeInfo serialization changes then both ClientProtocol and DatanodeProtocol versions should be
              incremented. Why do we need extra version?
              Only the last version changes should be described, the old ones could be retrieved from svn.
          6. enumerateThreadGroup() is not used locally.
          7. Spelling: should be "stream"
              private InputStream blockIn; // data strean
            

          Sorry, I did not get to the essentials of your patch. But it is really hard to understand without
          knowing the details of the protocol.

          Raghu Angadi added a comment -

          With 6 instances of 'dfs -cat 5GbFile > /dev/null', both tests were CPU bound.

          without the patch : ~6.5 min
          with the patch : ~3.5 min

          Raghu Angadi added a comment -

          Also, this is friendly to Java's FileChannel.transferTo() interface, which could remove one more memory copy. That would be a good experiment; a rough sketch is below.
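          A minimal sketch of that experiment, assuming the block file is exposed as a FileChannel and the socket as a WritableByteChannel; this is illustrative only, not part of this patch:

            import java.io.FileInputStream;
            import java.io.IOException;
            import java.nio.channels.FileChannel;
            import java.nio.channels.WritableByteChannel;

            class TransferToSketch {
              // Lets the kernel move bytes disk -> socket without staging
              // them in a user-space buffer.
              static void sendRange(FileInputStream blockFile, WritableByteChannel sock,
                                    long offset, long length) throws IOException {
                FileChannel fc = blockFile.getChannel();
                long pos = offset;
                long remaining = length;
                while (remaining > 0) {
                  long sent = fc.transferTo(pos, remaining, sock); // may transfer fewer bytes
                  pos += sent;
                  remaining -= sent;
                }
              }
            }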

          Raghu Angadi added a comment -

          The attached patch removes extra buffer copies when data is read from the datanode (by the client or while replicating).

          • before : disk --> large BufferedInputStream --> small datanode buffer --> large BufferedOutputStream --> socket
          • after : disk --> large datanode buffer --> socket
          • Each arrow represents a memory copy. The cost of the arrows at the ends is shared between user and kernel, I think (using a direct buffer might further reduce that; will try). A sketch of the 'after' loop follows below.
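          A minimal sketch of the 'after' loop, assuming one large buffer between the block file and the socket; PACKET_SIZE and the method shape are illustrative, not the actual patch code:

            import java.io.EOFException;
            import java.io.IOException;
            import java.io.InputStream;
            import java.io.OutputStream;

            class SingleBufferSender {
              static final int PACKET_SIZE = 64 * 1024; // hypothetical large datanode buffer

              static void sendBlock(InputStream blockIn, OutputStream sockOut,
                                    long blockLen) throws IOException {
                byte[] buf = new byte[PACKET_SIZE];
                long remaining = blockLen;
                while (remaining > 0) {
                  int toRead = (int) Math.min(buf.length, remaining);
                  int read = blockIn.read(buf, 0, toRead);  // first arrow: disk -> buffer
                  if (read < 0) {
                    throw new EOFException("unexpected EOF while reading block");
                  }
                  sockOut.write(buf, 0, read);              // second arrow: buffer -> socket
                  remaining -= read;
                }
                sockOut.flush();
              }
            }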

          I will post more microbenchmarks similar to those in the last comment.

          We can reduce one more copy on the DFSClient. The current readChunk() interface in FSInputChecker does not allow it. We could add an optional readChunks() so that an implementation can get access to the user's complete buffer; there would be a default implementation of this. Should I file a jira?
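          A hedged sketch of that idea; readChunks() and its signature are the proposal here, not an existing FSInputChecker method:

            import java.io.IOException;

            abstract class InputCheckerSketch {
              // Existing per-chunk hook (signature simplified for illustration).
              protected abstract int readChunk(long pos, byte[] buf, int offset,
                                               int len, byte[] checksum) throws IOException;

              // Proposed optional hook: the default falls back to a single chunk,
              // but an implementation may override it to read many chunks straight
              // into the caller's buffer, saving one client-side copy.
              protected int readChunks(long pos, byte[] userBuf, int offset,
                                       int len, byte[] checksums) throws IOException {
                return readChunk(pos, userBuf, offset, len, checksums);
              }
            }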

          This patch changes the DATA_TRANSFER_PROTOCOL a bit.

          Currently there are no improvements in buffering while writing data to DFS. I will do that in a follow-up jira.

          All the unit tests pass. I will run them on Windows as well. No new tests are added, since this does not change any functionality and is purely a performance improvement.

          Raghu Angadi added a comment -

          With a preliminary patch that removes extra copies on the datanode while reading a block, the results are promising.
          I ran 4 instances of 'dfs -cat 5GbFile > /dev/null', similar to the tests in HADOOP-2144. All the blocks are local.

          branch-0.16 : ~4 min. CPU bound. User CPU is 3 times the kernel CPU.
          trunk + patch : ~3 min. Disk bound. User CPU is 2 times the kernel CPU. Only ~10-20% CPU was left idle.

          Also, from HADOOP-2144, datanode CPU is around 0.9 times DFSClient CPU. Even after ignoring idle CPU in the second test, the datanode takes less than half the CPU with the patch. This includes both user and kernel CPU taken by the datanode. Assuming kernel CPU is the same in both cases, the user CPU taken by the datanode in the second test would be much less than half (maybe closer to a third).


            People

            • Assignee:
              Raghu Angadi
              Reporter:
              Raghu Angadi
            • Votes:
              0
              Watchers:
              2
