Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-881

Refactor DataNode Packet header into DataTransferProtocol

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The Packet Header format is used ad-hoc in various places. This JIRA is to refactor it into a class inside DataTransferProtocol (like was done with PipelineAck)

      1. hdfs-881.txt
        19 kB
        Todd Lipcon
      2. hdfs-881.txt
        26 kB
        Todd Lipcon

        Issue Links

          Activity

          Hide
          Todd Lipcon added a comment -

          This implements the proposed refactor.

          Couple notes:

          • I had to implement read/write fields both for OutputStream and for ByteBuffer in/out. It's a bit ugly, but I wasn't able to find a way of avoiding one or the other without adding extra double buffering or writing some kind of ByteBuffer<->DataOutput translation class in Common util. If we think this is necessary, let's do it in another JIRA.
          • I changed the meaning of PKT_HEADER_LEN when moving it into the new class. It used to be that nearly every instance of its use also added SIZE_OF_INTEGER, so I just incorporated that into the constant to make the code simpler everywhere else.
          Show
          Todd Lipcon added a comment - This implements the proposed refactor. Couple notes: I had to implement read/write fields both for OutputStream and for ByteBuffer in/out. It's a bit ugly, but I wasn't able to find a way of avoiding one or the other without adding extra double buffering or writing some kind of ByteBuffer<->DataOutput translation class in Common util. If we think this is necessary, let's do it in another JIRA. I changed the meaning of PKT_HEADER_LEN when moving it into the new class. It used to be that nearly every instance of its use also added SIZE_OF_INTEGER, so I just incorporated that into the constant to make the code simpler everywhere else.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12429704/hdfs-881.txt
          against trunk revision 897068.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/88/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/88/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/88/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/88/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12429704/hdfs-881.txt against trunk revision 897068. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/88/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/88/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/88/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/88/console This message is automatically generated.
          Hide
          Konstantin Boudnik added a comment -

          Oh, great! You've gone a step further and did what I had started in HDFS-608

          I think Druba's comment about performance measurement makes sense and it is something which needs to be tested.

          Show
          Konstantin Boudnik added a comment - Oh, great! You've gone a step further and did what I had started in HDFS-608 I think Druba's comment about performance measurement makes sense and it is something which needs to be tested.
          Hide
          Todd Lipcon added a comment -

          Hey Konstantin,

          Sorry, didn't notice HDFS-608 before - I just did this as I was working on HDFS-877.

          I can probably get a chance to run some benchmarks tomorrow. I don't anticipate this will be a problem, as it's adding only one method-call indirection per packet, and each packet contains a reasonably large amount of data. Compared to other operations like checksumming, this is just noise.

          Show
          Todd Lipcon added a comment - Hey Konstantin, Sorry, didn't notice HDFS-608 before - I just did this as I was working on HDFS-877 . I can probably get a chance to run some benchmarks tomorrow. I don't anticipate this will be a problem, as it's adding only one method-call indirection per packet, and each packet contains a reasonably large amount of data. Compared to other operations like checksumming, this is just noise.
          Hide
          Jakob Homan added a comment -

          Patch review:

          • Overall looks good, but unfortunately has become unsync'ed with trunk. Todd, if you can re-base, I'll work on getting this committed.
          • It might be good to move the sanity check (line 50-52 in patch) into a method on the PacketHeader (
            boolean sanityCheck(int lastSeqNo)

            ) class to make PacketHeader repsonsible for its own sanity check and to simplify DFSClient. Less important might be a method to do the debug log entry in PacketHeader rather than DFSClient; either is fine.

          • Once this is done, it'll be easier to unit test the PacketHeader class, which should also be done, if just for the sanityCheck and the writable methods. Otherwise, we can consider this a re-factor and verify correctness through existing tests.
          • Having two readFields seems reasonable for now.

          This patch will lead to more object creation/churn, but that shouldn't be an issue; I think benchmarks before committing will be unnecessary.

          Show
          Jakob Homan added a comment - Patch review: Overall looks good, but unfortunately has become unsync'ed with trunk. Todd, if you can re-base, I'll work on getting this committed. It might be good to move the sanity check (line 50-52 in patch) into a method on the PacketHeader ( boolean sanityCheck( int lastSeqNo) ) class to make PacketHeader repsonsible for its own sanity check and to simplify DFSClient. Less important might be a method to do the debug log entry in PacketHeader rather than DFSClient; either is fine. Once this is done, it'll be easier to unit test the PacketHeader class, which should also be done, if just for the sanityCheck and the writable methods. Otherwise, we can consider this a re-factor and verify correctness through existing tests. Having two readFields seems reasonable for now. This patch will lead to more object creation/churn, but that shouldn't be an issue; I think benchmarks before committing will be unnecessary.
          Hide
          Jakob Homan added a comment -

          Canceling patch post-review...

          Show
          Jakob Homan added a comment - Canceling patch post-review...
          Hide
          Todd Lipcon added a comment -

          Here's an updated patch that addresses Jakob's review. I ran commit-tests locally and they passed.

          Show
          Todd Lipcon added a comment - Here's an updated patch that addresses Jakob's review. I ran commit-tests locally and they passed.
          Hide
          Jakob Homan added a comment -

          > I ran commit-tests locally and they passed.
          Does this mean just the commit test target or all of the tests? Since we're relying on existing tests to verify this refactor, it'd be good to include the test-patch and results for the whole test suite. Hudson is still quite flaky, so it's probably best not to rely on it...

          Show
          Jakob Homan added a comment - > I ran commit-tests locally and they passed. Does this mean just the commit test target or all of the tests? Since we're relying on existing tests to verify this refactor, it'd be good to include the test-patch and results for the whole test suite. Hudson is still quite flaky, so it's probably best not to rely on it...
          Hide
          Jakob Homan added a comment -

          +1 otherwise. Thanks for updating the patch.

          Show
          Jakob Homan added a comment - +1 otherwise. Thanks for updating the patch.
          Hide
          Todd Lipcon added a comment -

          test-patch shows:
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 3 new or modified tests.
          [exec]
          [exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]
          [exec] +1 system tests framework. The patch passed system tests framework compile.
          [exec]

          The one javadoc warning is:

          [exec] [javadoc] /data/1/scratch/patchqueue/patch-worker-27503/patch_89/svnrepo/src/java/org/apache/hadoop/hdfs/server/datanode/metrics/FSDatasetMBean.java:40: warning - Tag @see: reference not found: org.apache.hadoop.hdfs.server.datanode.metrics.DataNodeStatisticsMBean

          which is unrelated and already covered in HDFS-1369.

          I also ran the full unit test suite, got some failures, but all of them are already reported in other JIRAs and are unrelated to this patch.

          Show
          Todd Lipcon added a comment - test-patch shows: [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] -1 javadoc. The javadoc tool appears to have generated 1 warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 system tests framework. The patch passed system tests framework compile. [exec] The one javadoc warning is: [exec] [javadoc] /data/1/scratch/patchqueue/patch-worker-27503/patch_89/svnrepo/src/java/org/apache/hadoop/hdfs/server/datanode/metrics/FSDatasetMBean.java:40: warning - Tag @see: reference not found: org.apache.hadoop.hdfs.server.datanode.metrics.DataNodeStatisticsMBean which is unrelated and already covered in HDFS-1369 . I also ran the full unit test suite, got some failures, but all of them are already reported in other JIRAs and are unrelated to this patch.
          Hide
          Jakob Homan added a comment -

          I've committed this. Resolving as fixed. Thanks, Todd.

          Show
          Jakob Homan added a comment - I've committed this. Resolving as fixed. Thanks, Todd.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #383 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/383/)

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #383 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/383/ )

            People

            • Assignee:
              Todd Lipcon
              Reporter:
              Todd Lipcon
            • Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development