Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-2003

Separate FSEditLog reading logic from editLog memory state building logic

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: Edit log branch (HDFS-1073), 0.23.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently FSEditLogLoader has code for reading from an InputStream interleaved with code which updates the FSNameSystem and FSDirectory. This makes it difficult to read an edit log without having a whole load of other object initialised, which is problematic if you want to do things like count how many transactions are in a file etc.

      This patch separates the reading of the stream and the building of the memory state.

      1. HDFS-2003.diff
        54 kB
        Ivan Kelly
      2. HDFS-2003.diff
        54 kB
        Ivan Kelly
      3. HDFS-2003.diff
        54 kB
        Ivan Kelly
      4. HDFS-2003-replicationfix-delta.diff
        3 kB
        Ivan Kelly
      5. hdfs-2003.txt
        54 kB
        Todd Lipcon
      6. HDFS-2003.diff
        51 kB
        Ivan Kelly
      7. hdfs-2003.txt
        51 kB
        Todd Lipcon
      8. hdfs-2003.txt
        50 kB
        Todd Lipcon
      9. 2003-delta.txt
        6 kB
        Todd Lipcon
      10. HDFS-2003.diff
        54 kB
        Ivan Kelly
      11. HDFS-2003.diff
        79 kB
        Ivan Kelly
      12. HDFS-2003.diff
        78 kB
        Ivan Kelly
      13. HDFS-2003.diff
        76 kB
        Ivan Kelly

        Issue Links

          Activity

          Hide
          Ivan Kelly added a comment -

          Patch applies against 1073 branch.

          Show
          Ivan Kelly added a comment - Patch applies against 1073 branch.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480553/HDFS-2003.diff
          against trunk revision 1127966.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/637//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480553/HDFS-2003.diff against trunk revision 1127966. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/637//console This message is automatically generated.
          Hide
          Ivan Kelly added a comment -

          Cleaned up imports

          Show
          Ivan Kelly added a comment - Cleaned up imports
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480926/HDFS-2003.diff
          against trunk revision 1128987.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/662//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12480926/HDFS-2003.diff against trunk revision 1128987. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/662//console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          Hey Ivan. Any chance you've been able to do a performance test of this with a large edit log?

          I don't imagine the performance implications will be problematic, but would like a sanity check before +1ing

          Show
          Todd Lipcon added a comment - Hey Ivan. Any chance you've been able to do a performance test of this with a large edit log? I don't imagine the performance implications will be problematic, but would like a sanity check before +1ing
          Hide
          Ivan Kelly added a comment -

          I was actually discussing the performance implications with Jitendra earlier. I'll see if I can get a log to test it with tomorrow.

          Show
          Ivan Kelly added a comment - I was actually discussing the performance implications with Jitendra earlier. I'll see if I can get a log to test it with tomorrow.
          Hide
          Todd Lipcon added a comment -

          any luck? I have some big logs around here if you want me to try.

          Show
          Todd Lipcon added a comment - any luck? I have some big logs around here if you want me to try.
          Hide
          Todd Lipcon added a comment -

          A few comments on the patch itself:

          • FSEditLogLoader.java line 174, length variable is unused.
          • DelegationTokenIdentifier import in the same file is unused
          • From the perspective of FSEditLogLoader, an EOFException while reading the opcode (expected) is treated the same as an EOFException in the middle of reading one of the ops (unexpected). This seems unintentional, since a truncated log file is not normal, right?
          • I'm thinking it might be slightly cleaner to make a new class like FSEditLogReader, which is instantiated with an InputStream and logVersion. It would then expose just a readOp() method. I think that will make it easier to mock up sources of edits in the future. What do you think?
          • Style nit: camelCase (eg "addCloseOp") is preferred for things like the variable "addcloseop". Another option which might make the code a little neater is to name the outer op (which hasn't been downcasted yet) something like "genericOp", and then use "op" for all of the downcasted ones.

          Perhaps more to come - got to run and catch a train

          Show
          Todd Lipcon added a comment - A few comments on the patch itself: FSEditLogLoader.java line 174, length variable is unused. DelegationTokenIdentifier import in the same file is unused From the perspective of FSEditLogLoader, an EOFException while reading the opcode (expected) is treated the same as an EOFException in the middle of reading one of the ops (unexpected). This seems unintentional, since a truncated log file is not normal, right? I'm thinking it might be slightly cleaner to make a new class like FSEditLogReader, which is instantiated with an InputStream and logVersion. It would then expose just a readOp() method. I think that will make it easier to mock up sources of edits in the future. What do you think? Style nit: camelCase (eg "addCloseOp") is preferred for things like the variable "addcloseop". Another option which might make the code a little neater is to name the outer op (which hasn't been downcasted yet) something like "genericOp", and then use "op" for all of the downcasted ones. Perhaps more to come - got to run and catch a train
          Hide
          Ivan Kelly added a comment -

          I'm still waiting to get access to some big logs. If you have some to hand, that'd be great. I'll address the other comments and upload a new patch.

          Show
          Ivan Kelly added a comment - I'm still waiting to get access to some big logs. If you have some to hand, that'd be great. I'll address the other comments and upload a new patch.
          Hide
          Ivan Kelly added a comment -

          From the perspective of FSEditLogLoader, an EOFException while reading the opcode (expected) is treated the same as an EOFException in the middle of reading one of the ops (unexpected). This seems unintentional, since a truncated log file is not normal, right?

          This is intentional. We read from the journal which has the most edits available to read. If this happens to be a journal with a truncated file, that journal is still the journal with the most up to date logs. Do you disagree?

          I'm thinking it might be slightly cleaner to make a new class like FSEditLogReader, which is instantiated with an InputStream and logVersion. It would then expose just a readOp() method. I think that will make it easier to mock up sources of edits in the future. What do you think?

          Makes sense.

          Show
          Ivan Kelly added a comment - From the perspective of FSEditLogLoader, an EOFException while reading the opcode (expected) is treated the same as an EOFException in the middle of reading one of the ops (unexpected). This seems unintentional, since a truncated log file is not normal, right? This is intentional. We read from the journal which has the most edits available to read. If this happens to be a journal with a truncated file, that journal is still the journal with the most up to date logs. Do you disagree? I'm thinking it might be slightly cleaner to make a new class like FSEditLogReader, which is instantiated with an InputStream and logVersion. It would then expose just a readOp() method. I think that will make it easier to mock up sources of edits in the future. What do you think? Makes sense.
          Hide
          Ivan Kelly added a comment -

          Uploaded patch addressing Todd's comments

          Show
          Ivan Kelly added a comment - Uploaded patch addressing Todd's comments
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481369/HDFS-2003.diff
          against trunk revision 1130870.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 9 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/697//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481369/HDFS-2003.diff against trunk revision 1130870. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/697//console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          This is intentional. We read from the journal which has the most edits available to read. If this happens to be a journal with a truncated file, that journal is still the journal with the most up to date logs. Do you disagree?

          In my opinion, that's the responsibility of the "edit log recovery" process to determine, and then truncate the file at the correct length. But, I see your point as well, and don't feel strongly about it. Either way, though, it's a distinct change from just the refactor - can we keep the current behavior in the refactor, and then make that behavioral change separately?

          Some other thoughts:

          • Do we need Reader to be an inner class of FSEditLogOp? I find it a little strange to have all of the Reader code, and then the "final Codes opCode" and "final long txid;" right after it.

          I think the patch would produce fewer conflicts on merge if we made the following change:

          • Keep FSEditLogOpCodes as is (so we don't have changes throughout EditLogFileInputStream/OutputStream/FSEditLog/Loader/OEV/etc. (this will help prevent merge conflicts against HDFS-1936 in particular)

          One idea, which you can take or leave: what if we did added a Class<? extends FSEditLogOp> field to the Codes enum, and then did the following:
          in Reader constructor:

          EnumMap<Codes, FSEditLogOp> opInstances;
          for (Codes c : Codes.values()) {
            opInstances.put(c, c.getOpClass().newInstance());
          }
          

          in readOp instead of the switch statement:

          FSEditLogOp op = opInstances.get(opCode);
          op.readFields(in, logVersion);
          

          This idea would remove the object overhead of creating new objects for each case, make opcodes more like writables, and get rid of the big switch statement. Might also be a good first step towards sharing more code between the OEV and the normal edits loader. This is just a thought, though - if you don't like it, ignore me

          Show
          Todd Lipcon added a comment - This is intentional. We read from the journal which has the most edits available to read. If this happens to be a journal with a truncated file, that journal is still the journal with the most up to date logs. Do you disagree? In my opinion, that's the responsibility of the "edit log recovery" process to determine, and then truncate the file at the correct length. But, I see your point as well, and don't feel strongly about it. Either way, though, it's a distinct change from just the refactor - can we keep the current behavior in the refactor, and then make that behavioral change separately? Some other thoughts: Do we need Reader to be an inner class of FSEditLogOp? I find it a little strange to have all of the Reader code, and then the "final Codes opCode" and "final long txid;" right after it. I think the patch would produce fewer conflicts on merge if we made the following change: Keep FSEditLogOpCodes as is (so we don't have changes throughout EditLogFileInputStream/OutputStream/FSEditLog/Loader/OEV/etc. (this will help prevent merge conflicts against HDFS-1936 in particular) One idea, which you can take or leave: what if we did added a Class<? extends FSEditLogOp> field to the Codes enum, and then did the following: in Reader constructor: EnumMap<Codes, FSEditLogOp> opInstances; for (Codes c : Codes.values()) { opInstances.put(c, c.getOpClass().newInstance()); } in readOp instead of the switch statement: FSEditLogOp op = opInstances.get(opCode); op.readFields(in, logVersion); This idea would remove the object overhead of creating new objects for each case, make opcodes more like writables, and get rid of the big switch statement. Might also be a good first step towards sharing more code between the OEV and the normal edits loader. This is just a thought, though - if you don't like it, ignore me
          Hide
          Todd Lipcon added a comment -

          Test stup

          I ran performance tests with an fsimage/edits pair I had from a real life cluster. The fsimage is about ~2G and has 12.5M files, and the edit log is exactly 2GB (I truncated it with dd to that length). I ran the NN with the following JVM options: -Xms14g -Xmx14g -XX:+UseCompressedOops.

          With Parallel (default) GC:

          I loaded the edit log 3 times each with the patch and without the patch from a local SATA disk.

          Without the patch, the logs loaded in 84 seconds (consistent across the 3 runs). With the patch, it loaded in 87s, consistent across the three runs.

          With CMS GC:

          I then added the JVM option: -XX:+UseConcMarkSweepGC, since that's more likely the GC in use on most large clusters.

          With the patch: Loaded in 86 seconds and incurred 213 young generation collections while loading the edit log, which added up to a total of 2.208 seconds in young gen GC.
          Without the patch: 84 seconds, 211 young gen GCs, adding up to 2.174 seconds.

          Summary

          The patch seems to have a very marginal impact on amount of time spent in GC, which makes sense since the objects are very short-lived and young-generation GC time is proportional to live object size, not garbage size. The patch seems to have about a 3-4% negative impact on overall wall clock time of loading the log.

          Do you guys think this is acceptable? In most of the clusters I see, edit logs tend to be much smaller than this, and startup time is dominated by loading the image and collecting block reports, not edits replay. So, I tend to think the improved code cleanliness of this patch is worth the perf hit.

          Show
          Todd Lipcon added a comment - Test stup I ran performance tests with an fsimage/edits pair I had from a real life cluster. The fsimage is about ~2G and has 12.5M files, and the edit log is exactly 2GB (I truncated it with dd to that length). I ran the NN with the following JVM options: -Xms14g -Xmx14g -XX:+UseCompressedOops. With Parallel (default) GC: I loaded the edit log 3 times each with the patch and without the patch from a local SATA disk. Without the patch, the logs loaded in 84 seconds (consistent across the 3 runs). With the patch, it loaded in 87s, consistent across the three runs. With CMS GC: I then added the JVM option: -XX:+UseConcMarkSweepGC, since that's more likely the GC in use on most large clusters. With the patch: Loaded in 86 seconds and incurred 213 young generation collections while loading the edit log, which added up to a total of 2.208 seconds in young gen GC. Without the patch: 84 seconds, 211 young gen GCs, adding up to 2.174 seconds. Summary The patch seems to have a very marginal impact on amount of time spent in GC, which makes sense since the objects are very short-lived and young-generation GC time is proportional to live object size, not garbage size. The patch seems to have about a 3-4% negative impact on overall wall clock time of loading the log. Do you guys think this is acceptable? In most of the clusters I see, edit logs tend to be much smaller than this, and startup time is dominated by loading the image and collecting block reports, not edits replay. So, I tend to think the improved code cleanliness of this patch is worth the perf hit.
          Hide
          Todd Lipcon added a comment -

          BTW, I thought of another reason why EOFException shouldn't be treated the same if it comes in the middle of a transaction:

          A lot of the transaction serialization formats have length-prefixed strings. In the case that there is corruption in the file, I often find we get into a situation where it's trying to read a length-prefixed string but instead gets some other random bytes (eg part of a filename). This causes it to issue a read() for a very large number of bytes, which, depending on how much heap is available, usually results in an OOME or an early EOFException. In the case of the EOFException, we don't want to treat it as a successful log read, which is what the code does now.

          Show
          Todd Lipcon added a comment - BTW, I thought of another reason why EOFException shouldn't be treated the same if it comes in the middle of a transaction: A lot of the transaction serialization formats have length-prefixed strings. In the case that there is corruption in the file, I often find we get into a situation where it's trying to read a length-prefixed string but instead gets some other random bytes (eg part of a filename). This causes it to issue a read() for a very large number of bytes, which, depending on how much heap is available, usually results in an OOME or an early EOFException. In the case of the EOFException, we don't want to treat it as a successful log read, which is what the code does now.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481572/HDFS-2003.diff
          against trunk revision 1132525.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/715//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481572/HDFS-2003.diff against trunk revision 1132525. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/715//console This message is automatically generated.
          Hide
          Ivan Kelly added a comment -

          I've uploaded a new patch which reduces the number of objects as suggested. I didn't put it in the FSEditLogOpCodes enum though as I didn't want to create a circular dependency there. I've moved the Reader to the end of FSEditLogOp. The reason I didn't put this into a new files is simply because I wanted to limit the number of files in the package.

          I've also changed the implementation so that an EOF in the middle of a transaction gets propogated as an error. Not as a EOFException though, but as a IOException.

          Show
          Ivan Kelly added a comment - I've uploaded a new patch which reduces the number of objects as suggested. I didn't put it in the FSEditLogOpCodes enum though as I didn't want to create a circular dependency there. I've moved the Reader to the end of FSEditLogOp. The reason I didn't put this into a new files is simply because I wanted to limit the number of files in the package. I've also changed the implementation so that an EOF in the middle of a transaction gets propogated as an error. Not as a EOFException though, but as a IOException.
          Hide
          Todd Lipcon added a comment -
          • Looks like a missing import in FSEditLogLoader of FSEditLogOp.*?

          I fixed the above issue and benchmarked again on the same machine/setup. Now the performance is identical to the pre-patch version. Nice

          I'll look over the actual code and comment momentarily.

          Show
          Todd Lipcon added a comment - Looks like a missing import in FSEditLogLoader of FSEditLogOp.*? I fixed the above issue and benchmarked again on the same machine/setup. Now the performance is identical to the pre-patch version. Nice I'll look over the actual code and comment momentarily.
          Hide
          Todd Lipcon added a comment -

          Here is a patch on top of your latest (https://issues.apache.org/jira/secure/attachment/12481572/HDFS-2003.diff) which makes the following changes:

          • made some methods static which didn't access any member vars
          • readOp() now returns null for an expected end-of-file – this is more like how most iterator interfaces work, I think a bit clearer than always having to catch EOF
          • fixed a couple of warnings (unused var, unnecessary suppresswarnings)

          I had one thought as I was going through this: can this actually go into trunk and then be merged into the 1073 branch? There isn't anything here that doesn't make sense in trunk, and since it's a pretty clean refactor I don't see why we need to do it on branch.

          Show
          Todd Lipcon added a comment - Here is a patch on top of your latest ( https://issues.apache.org/jira/secure/attachment/12481572/HDFS-2003.diff ) which makes the following changes: made some methods static which didn't access any member vars readOp() now returns null for an expected end-of-file – this is more like how most iterator interfaces work, I think a bit clearer than always having to catch EOF fixed a couple of warnings (unused var, unnecessary suppresswarnings) I had one thought as I was going through this: can this actually go into trunk and then be merged into the 1073 branch? There isn't anything here that doesn't make sense in trunk, and since it's a pretty clean refactor I don't see why we need to do it on branch.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481618/2003-delta.txt
          against trunk revision 1132715.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/726//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481618/2003-delta.txt against trunk revision 1132715. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/726//console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          Rebased patch against trunk, using the new LayoutVersion support, etc.

          Show
          Todd Lipcon added a comment - Rebased patch against trunk, using the new LayoutVersion support, etc.
          Hide
          Ivan Kelly added a comment -

          Im +1 on all these changes. One small question though. You're delta has:

          • @SuppressWarnings("deprecation,unchecked")
            + @SuppressWarnings("deprecation")
            for load loadEditRecords. Does this not give a compile warning?

          Also, going into trunk is a good idea.

          Show
          Ivan Kelly added a comment - Im +1 on all these changes. One small question though. You're delta has: @SuppressWarnings("deprecation,unchecked") + @SuppressWarnings("deprecation") for load loadEditRecords. Does this not give a compile warning? Also, going into trunk is a good idea.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481622/hdfs-2003.txt
          against trunk revision 1132715.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.balancer.TestBalancer

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/727//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/727//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/727//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481622/hdfs-2003.txt against trunk revision 1132715. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.server.balancer.TestBalancer +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/727//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/727//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/727//console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -
          • Updated patch against trunk to include the new ReassignLeaseOp.
          • I filed HDFS-2041 for the findbugs warnings - turns out they are legit, but I want to keep this as a refactor and commit the bug fix right afterwards.

          For me, I was getting an error that the "unchecked" suppression wasn't necessary - where is there an unchecked cast warning?

          Show
          Todd Lipcon added a comment - Updated patch against trunk to include the new ReassignLeaseOp. I filed HDFS-2041 for the findbugs warnings - turns out they are legit, but I want to keep this as a refactor and commit the bug fix right afterwards. For me, I was getting an error that the "unchecked" suppression wasn't necessary - where is there an unchecked cast warning?
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481642/hdfs-2003.txt
          against trunk revision 1132829.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestHDFSCLI
          org.apache.hadoop.hdfs.server.balancer.TestBalancer

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/732//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/732//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/732//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481642/hdfs-2003.txt against trunk revision 1132829. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.balancer.TestBalancer +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/732//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/732//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/732//console This message is automatically generated.
          Hide
          Ivan Kelly added a comment -

          Fixed the findbugs. I don't think its worth doing it in another JIRA as the changes for it are tiny and don't do very much. I commented the code out instead of removing so that it's clear what was meant to be read.

          The reason this didn't hit before was that timestamp and atime were being reused in FSEditLogLoader#loadEditRecords().

          Show
          Ivan Kelly added a comment - Fixed the findbugs. I don't think its worth doing it in another JIRA as the changes for it are tiny and don't do very much. I commented the code out instead of removing so that it's clear what was meant to be read. The reason this didn't hit before was that timestamp and atime were being reused in FSEditLogLoader#loadEditRecords().
          Hide
          Ivan Kelly added a comment -

          Ah, hadn't noticed you'd made a bit of code change in HDFS-2041. Ignore my last patch in that case.

          Show
          Ivan Kelly added a comment - Ah, hadn't noticed you'd made a bit of code change in HDFS-2041 . Ignore my last patch in that case.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481701/HDFS-2003.diff
          against trunk revision 1132881.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestHDFSCLI

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/733//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/733//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/733//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481701/HDFS-2003.diff against trunk revision 1132881. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/733//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/733//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/733//console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          For reviewers, please review http://issues.apache.org/jira/secure/attachment/12481642/hdfs-2003.txt

          No tests included since this is a straight refactor. The findbugs warnings are fixed by HDFS-2041.

          Show
          Todd Lipcon added a comment - For reviewers, please review http://issues.apache.org/jira/secure/attachment/12481642/hdfs-2003.txt No tests included since this is a straight refactor. The findbugs warnings are fixed by HDFS-2041 .
          Hide
          Todd Lipcon added a comment -

          Did a self-review by going through the diff with the old code on the left and the new Op classes on the right. Two small notes, one of which is a real bug:

          • AddCloseOp has the following problem: in the old code, it would call fsNamesys.adjustReplication(), and then the adjusted replication would be passed to the readBlocks call. In the new code, the unadjusted replication is passed. So, the blocks end up with the wrong replication count.
          • We should rename SetGenstampOp.lw to something more meaningful (perhaps genStamp)
          Show
          Todd Lipcon added a comment - Did a self-review by going through the diff with the old code on the left and the new Op classes on the right. Two small notes, one of which is a real bug: AddCloseOp has the following problem: in the old code, it would call fsNamesys.adjustReplication(), and then the adjusted replication would be passed to the readBlocks call. In the new code, the unadjusted replication is passed. So, the blocks end up with the wrong replication count. We should rename SetGenstampOp.lw to something more meaningful (perhaps genStamp )
          Hide
          Todd Lipcon added a comment -

          It turns out that the bug mentioned above does not actually cause a problem. I added a unit test here to be sure of it – the BlockInfo objects do get created with a replication count that's too low, but before the triplets array is used, ensureCapacity will resize it appropriately.

          I think this patch is ready for commit if someone can please review.

          Show
          Todd Lipcon added a comment - It turns out that the bug mentioned above does not actually cause a problem. I added a unit test here to be sure of it – the BlockInfo objects do get created with a replication count that's too low, but before the triplets array is used, ensureCapacity will resize it appropriately. I think this patch is ready for commit if someone can please review.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481764/hdfs-2003.txt
          against trunk revision 1133225.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestHDFSCLI

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/738//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/738//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/738//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481764/hdfs-2003.txt against trunk revision 1133225. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/738//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/738//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/738//console This message is automatically generated.
          Hide
          Ivan Kelly added a comment -

          Fixed the reading of blocks so that replication is adjusted before use. While it may not be strictly necessary, it's could introduce a real bug in the future, so its best to set the replication correctly from the outset.

          https://issues.apache.org/jira/secure/attachment/12481812/HDFS-2003.diff is the same as http://issues.apache.org/jira/secure/attachment/12481764/hdfs-2003.txt with HDFS-2003-replicationfix-delta.diff applied.

          Show
          Ivan Kelly added a comment - Fixed the reading of blocks so that replication is adjusted before use. While it may not be strictly necessary, it's could introduce a real bug in the future, so its best to set the replication correctly from the outset. https://issues.apache.org/jira/secure/attachment/12481812/HDFS-2003.diff is the same as http://issues.apache.org/jira/secure/attachment/12481764/hdfs-2003.txt with HDFS-2003 -replicationfix-delta.diff applied.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481812/HDFS-2003.diff
          against trunk revision 1133225.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestHDFSCLI
          org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes
          org.apache.hadoop.hdfs.server.balancer.TestBalancer
          org.apache.hadoop.hdfs.server.namenode.TestBlockTokenWithDFS
          org.apache.hadoop.hdfs.TestFileCreationNamenodeRestart
          org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage
          org.apache.hadoop.hdfs.TestLeaseRecovery2

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/739//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/739//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/739//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481812/HDFS-2003.diff against trunk revision 1133225. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes org.apache.hadoop.hdfs.server.balancer.TestBalancer org.apache.hadoop.hdfs.server.namenode.TestBlockTokenWithDFS org.apache.hadoop.hdfs.TestFileCreationNamenodeRestart org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage org.apache.hadoop.hdfs.TestLeaseRecovery2 +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/739//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/739//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/739//console This message is automatically generated.
          Hide
          Ivan Kelly added a comment -

          Resubmitting patch fixing error which caused hudson failure. As reading in the Block and constructing the BlockInfo have been separated, Block objects need to be persisted. i.e. The same block object cant be reused for all the reads.

          @Todd could you give this a quick test with your benchmark to make sure it doesn't hit performance.

          Show
          Ivan Kelly added a comment - Resubmitting patch fixing error which caused hudson failure. As reading in the Block and constructing the BlockInfo have been separated, Block objects need to be persisted. i.e. The same block object cant be reused for all the reads. @Todd could you give this a quick test with your benchmark to make sure it doesn't hit performance.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481829/HDFS-2003.diff
          against trunk revision 1133225.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestHDFSCLI
          org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/740//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/740//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/740//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481829/HDFS-2003.diff against trunk revision 1133225. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/740//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/740//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/740//console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          @Todd could you give this a quick test with your benchmark to make sure it doesn't hit performance.

          Yep, ran the tests with latest patch and there's no discernible perf hit with the patch.

          Will look over your diff momentarily.

          Show
          Todd Lipcon added a comment - @Todd could you give this a quick test with your benchmark to make sure it doesn't hit performance. Yep, ran the tests with latest patch and there's no discernible perf hit with the patch. Will look over your diff momentarily.
          Hide
          Todd Lipcon added a comment -

          Looks pretty good. Small nits:

          • please use curly braces around the if and else blocks in the OP_ADD_BLOCK section, per style guidelines
          • in readBlocks, move the construction of BlockTwo inside the else clause, since it's only needed in old versions
          Show
          Todd Lipcon added a comment - Looks pretty good. Small nits: please use curly braces around the if and else blocks in the OP_ADD_BLOCK section, per style guidelines in readBlocks, move the construction of BlockTwo inside the else clause, since it's only needed in old versions
          Hide
          Ivan Kelly added a comment -

          Addressed the two things from Todd's previous comment.

          Show
          Ivan Kelly added a comment - Addressed the two things from Todd's previous comment.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481897/HDFS-2003.diff
          against trunk revision 1133476.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 4 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestHDFSCLI
          org.apache.hadoop.hdfs.TestHDFSTrash

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/749//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/749//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/749//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481897/HDFS-2003.diff against trunk revision 1133476. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/749//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/749//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/749//console This message is automatically generated.
          Hide
          Todd Lipcon added a comment -

          +1, good stuff! I'll commit momentarily

          Show
          Todd Lipcon added a comment - +1, good stuff! I'll commit momentarily
          Hide
          Todd Lipcon added a comment -

          Committed to trunk, thanks Ivan! I will try to merge this into the 1073 branch this afternoon or evening.

          Show
          Todd Lipcon added a comment - Committed to trunk, thanks Ivan! I will try to merge this into the 1073 branch this afternoon or evening.
          Hide
          Todd Lipcon added a comment -

          Updating fix versions, this was done in trunk as well.

          Show
          Todd Lipcon added a comment - Updating fix versions, this was done in trunk as well.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #746 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/746/)

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #746 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/746/ )
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #699 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/699/)

          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #699 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/699/ )

            People

            • Assignee:
              Ivan Kelly
              Reporter:
              Ivan Kelly
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development