Hadoop HDFS
HDFS-1936

Updating the layout version from HDFS-1822 causes upgrade problems.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.22.0, 0.23.0
    • Fix Version/s: 0.22.0, 0.23.0
    • Component/s: namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk were changed. Some of the namenode logic that depends on the layout version is broken because of this. Read the comments for a more detailed description.
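
      The failure mode can be illustrated with a small, purely illustrative sketch (the constant and method below are hypothetical, not actual NameNode code): checks written as bare comparisons against magic layout version numbers silently change meaning once the versions are renumbered on a branch, which is what the feature-based check introduced in this jira avoids.

        // Illustrative only -- not actual NameNode code. Layout versions are
        // negative integers that grow more negative over time, so "newer"
        // means numerically smaller.
        class FragileLayoutCheck {
          // Hypothetical version at which some feature F was introduced.
          static final int FEATURE_F_VERSION = -24;

          // Fragile style: a bare comparison against a magic number. When a
          // branch renumbers its layout versions (as HDFS-1822 did), this
          // condition no longer matches the images it was written for.
          static boolean supportsFeatureF(int imageLayoutVersion) {
            return imageLayoutVersion <= FEATURE_F_VERSION;
          }
        }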

      1. hadoop-22-dfs-dir.tgz
        173 kB
        Todd Lipcon
      2. HDFS-1936.3.patch
        50 kB
        Suresh Srinivas
      3. HDFS-1936.4.patch
        51 kB
        Suresh Srinivas
      4. HDFS-1936.6.patch
        50 kB
        Suresh Srinivas
      5. HDFS-1936.6.patch
        50 kB
        Suresh Srinivas
      6. HDFS-1936.7.patch
        51 kB
        Suresh Srinivas
      7. HDFS-1936.8.patch
        51 kB
        Suresh Srinivas
      8. HDFS-1936.9.patch
        50 kB
        Suresh Srinivas
      9. HDFS-1936.rel22.patch
        27 kB
        Suresh Srinivas
      10. HDFS-1936.trunk.patch
        13 kB
        Suresh Srinivas
      11. hdfs-1936-with-testcase.txt
        18 kB
        Todd Lipcon

        Issue Links

          Activity

          Arun C Murthy made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-22-branch #61 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/61/)

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #704 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/704/)

          Suresh Srinivas made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Suresh Srinivas added a comment -

          Agreed, we should add more upgrade tests.

          I committed the patch to both trunk and 0.22.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #685 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/685/)

          Todd Lipcon added a comment -

          +1 on the 0.22 patch.

          We should probably add an 0.20.0 and 0.20.203 image tarball to these tests, too, given we have the infrastructure, but we can do that separately for sure.

          Suresh Srinivas added a comment -

          I ran the unit tests on the 0.22 patch. The following tests fail: TestHDFSTrash and TestMissingBlocksAlert. The first is a known failure; the second one could be from HDFS-1954?

          Todd, can you please review the 0.22 patch?

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480997/HDFS-1936.rel22.patch
          against trunk revision 1129831.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/668//console

          This message is automatically generated.

          Suresh Srinivas made changes -
          Attachment HDFS-1936.rel22.patch [ 12480997 ]
          Suresh Srinivas added a comment -

          Attaching the 0.22 version of the patch.

          Todd Lipcon added a comment -

          Ah, sorry, I had meant +1 above; I just wanted to wait to make sure Hudson was OK.

          So, official +1

          Suresh Srinivas added a comment -

          Todd, I am waiting for your +1 to commit this change. I will also have a 0.22 version of the patch.

          Suresh Srinivas added a comment -

          Interestingly, Hudson can reproduce the test failure, but there are no logs due to the test timeout.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480985/HDFS-1936.9.patch
          against trunk revision 1129831.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 16 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestDFSUpgradeFromImage

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/667//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/667//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/667//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Hmm, I was able to reproduce it reliably the other day; today it's not cooperating. I'll loop it until it fails.

          Suresh Srinivas added a comment -

          Todd, if you can reproduce the TestDFSUpgradeFromImage failure (I cannot on my machine), can you upload the logs to HDFS-2020?

          Todd Lipcon added a comment -

          +1 pending Hudson. Thanks for being patient with my quibbling

          Suresh Srinivas made changes -
          Attachment HDFS-1936.9.patch [ 12480985 ]
          Suresh Srinivas added a comment -

          Reverted the DataNode.java changes in the new version of the patch.

          Suresh Srinivas added a comment -

          > I know it's been a singleton for a while, but with federation, it seems to cause a lot more problems – this test didn't ever time out before. So, I think the singleton issue is a blocker, not just code cleanup, right?

          The problem may be more pronounced with federation, but you can see that this is a problem for MiniDFSCluster. Are you suggesting this bug needs to be fixed in this jira? I think it is unrelated to what this jira is addressing, namely that the layout version changes in HDFS-1822 cause upgrade issues. I will create a separate jira, as I said earlier.

          I think the new jira should be a blocker for 0.23 and not 0.22 (where by lucky coincidence the test works).

          > I think the null checks you had to add are also due to this singleton usage – they're masking an issue but don't inherently make sense.
          Not sure what you are getting at here. This is an initialization issue that I was addressing. To make progress on this issue, I can revert that part.

          Todd Lipcon added a comment -

          Hi Suresh. I know it's been a singleton for a while, but with federation, it seems to cause a lot more problems – this test didn't ever time out before. So, I think the singleton issue is a blocker, not just code cleanup, right?

          I think the null checks you had to add are also due to this singleton usage – they're masking an issue but don't inherently make sense.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480885/HDFS-1936.8.patch
          against trunk revision 1128987.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 16 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestDFSUpgradeFromImage

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/660//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/660//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/660//console

          This message is automatically generated.

          Suresh Srinivas made changes -
          Attachment HDFS-1936.8.patch [ 12480885 ]
          Suresh Srinivas added a comment -

          Fixing javadoc warning.

          Suresh Srinivas added a comment -

          I plan on addressing that in another jira. It is a singleton in 0.22 also.

          Todd Lipcon added a comment -

          Oh... I remember now... I already mentioned this above - it's the race condition since DN is being treated like a singleton. Unfortunately I think we need to fix this.

          Todd Lipcon added a comment -

          Hey Suresh. It's strange, this patch seems to pass half the time, and the other half the time the cluster gets stuck starting up and the test times out. Are you seeing the same thing?

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #680 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/680/)
          HDFS-1936. Part 1 or 2 - Updating the layout version from HDFS-1822 causes upgrade problems. Committing the required image tar ball.
          HDFS-1936. Part 1 or 2 - Updating the layout version from HDFS-1822 causes upgrade problems. Committing the required image tar ball.

          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1128534
          Files :

          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/hadoop-22-dfs-dir.tgz

          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1128527
          Files :

          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/common/Storage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/EditsLoaderCurrent.java
          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/UpgradeUtilities.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewer.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/hadoop-22-dfs-dir.tgz
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/protocol/FSConstants.java
          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/protocol/TestLayoutVersion.java
          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480730/HDFS-1936.7.patch
          against trunk revision 1128542.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 16 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestDFSUpgradeFromImage

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/655//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/655//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/655//console

          This message is automatically generated.

          Suresh Srinivas made changes -
          Attachment HDFS-1936.7.patch [ 12480730 ]
          Suresh Srinivas added a comment -

          Minor build.xml change to copy the 0.22 image tarball to the test directory.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480723/HDFS-1936.6.patch
          against trunk revision 1128534.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestDFSUpgradeFromImage

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/653//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/653//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/653//console

          This message is automatically generated.

          Suresh Srinivas made changes -
          Attachment HDFS-1936.6.patch [ 12480723 ]
          Suresh Srinivas added a comment -

          I was not sure how you generated the image tarball. Will resurrect the check.

          > Looks like some spurious changes to DataNode.java
          I saw null pointer exceptions in that part of the code, hence the check during the upgrade tests.

          > Indentation slightly off in loadLocalNameINodes
          The loadLocalNameINodes method's indentation is off (3 spaces instead of two). Anyway, I have aligned the new line with the existing code.

          Suresh Srinivas added a comment -

          I have reverted the previous commit! An svn commit with just that one file specified, while the rest of the changes are still in the dev environment, still commits all the changes in the dev environment.

          Suresh Srinivas made changes -
          Attachment HDFS-1936.6.patch [ 12480722 ]
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #693 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/693/)
          HDFS-1936. Part 1 or 2 - Updating the layout version from HDFS-1822 causes upgrade problems. Committing the required image tar ball.
          HDFS-1936. Part 1 or 2 - Updating the layout version from HDFS-1822 causes upgrade problems. Committing the required image tar ball.

          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1128534
          Files :

          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/hadoop-22-dfs-dir.tgz

          suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1128527
          Files :

          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/common/Storage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/EditsLoaderCurrent.java
          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/UpgradeUtilities.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/OfflineImageViewer.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/hadoop-22-dfs-dir.tgz
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/protocol/FSConstants.java
          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/protocol/TestLayoutVersion.java
          • /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java
          • /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
          Todd Lipcon added a comment -

          Sounds good to me.

          Suresh Srinivas added a comment -

          Given the results of the tests so far, I am going to commit the 0.22 image tarball and kick off the tests with the new patch.

          Suresh Srinivas added a comment -

          > looks like you caught the bottom of my comments - others still stand on the latest patch
          My latest patch was to fix the test failures and not to address your comments. Will post a new patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480682/HDFS-1936.4.patch
          against trunk revision 1128393.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 17 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.namenode.TestBlocksWithNotEnoughRacks

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/649//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/649//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/649//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          looks like you caught the bottom of my comments - others still stand on the latest patch

          Suresh Srinivas made changes -
          Attachment HDFS-1936.4.patch [ 12480682 ]
          Suresh Srinivas added a comment -

          Minor bug fix to address the test failures.

          Todd Lipcon added a comment -

          Wow, this is so much improved!

          • I'm curious why you've changed the test to no longer verify the contents of the filesystem after upgrade for the 0.22 case? The way I generated the 0.22 tarball attached above was to do an upgrade from the 0.14 one, so all of the checksums should be intact. This will also help ensure that the DN upgrade process works with federation.
          • In this code, do we need the if statement? Seems like it would be an error to call this before the feature map is initialized.
            +  private static void specialInit(int lv, Feature f) {
            +    EnumSet<Feature> set = map.get(lv);
            +    if (set != null) {
            +      set.add(f);
            +    }
            +  }
            
          • Looks like some spurious changes to DataNode.java?
          • Indentation slightly off in loadLocalNameINodes
          • This diff looks not quite right - you're checking if FSConstants.LAYOUT_VERSION supports federation, but that's always true in trunk. So, this code will always run. I think you want to check !LayoutVersion.supports(Feature.FEDERATION, storage.getLayoutVersion()) maybe? (A small sketch of this distinction follows after this list.)
            -    if (startOpt == StartupOption.UPGRADE
            -        && storage.getLayoutVersion() > Storage.LAST_PRE_FEDERATION_LAYOUT_VERSION) {
            +    if (startOpt == StartupOption.UPGRADE && 
            +        LayoutVersion.supports(Feature.FEDERATION, FSConstants.LAYOUT_VERSION)) {
            
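          A minimal sketch of the distinction being pointed at above, assuming the LayoutVersion.supports(Feature, int) API from the patch; the method name below and the storageLayoutVersion parameter (standing in for storage.getLayoutVersion()) are illustrative:

            // Illustrative sketch only, not the actual patch code.
            static boolean needsPreFederationUpgrade(int storageLayoutVersion) {
              // FSConstants.LAYOUT_VERSION is the version the running software
              // writes, so supports(FEDERATION, FSConstants.LAYOUT_VERSION) is
              // always true on trunk and would make the upgrade branch
              // unconditional. The check needs the on-disk image's version:
              return !LayoutVersion.supports(Feature.FEDERATION, storageLayoutVersion);
            }
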
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480671/HDFS-1936.3.patch
          against trunk revision 1128393.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 17 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestDFSRollback
          org.apache.hadoop.hdfs.TestDFSUpgrade

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/647//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/647//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/647//console

          This message is automatically generated.

          Suresh Srinivas made changes -
          Attachment HDFS-1936.22.patch [ 12479156 ]
          Suresh Srinivas made changes -
          Attachment HDFS-1936.2.patch [ 12480662 ]
          Suresh Srinivas made changes -
          Attachment HDFS-1936.1.patch [ 12480628 ]
          Suresh Srinivas made changes -
          Attachment HDFS-1936.3.patch [ 12480671 ]
          Suresh Srinivas made changes -
          Attachment HDFS-1936.3.patch [ 12480667 ]
          Suresh Srinivas made changes -
          Attachment HDFS-1936.3.patch [ 12480667 ]
          Suresh Srinivas added a comment -

          Changes:

          1. LayoutVersion.java is added to provide (a minimal sketch of the idea follows this list):
            • A cleaner check than comparing raw layout versions. Every feature that changes the layout version is added to the list of features, with the corresponding layout version. In case a layout version does not have its immediate predecessor as the ancestor, a different ancestor can be specified.
            • Only the layout versions starting from -16 are added. One could add all the other layout versions, but that can happen in another jira.
            • All the code that compares layout versions, starting from -16, is changed to check whether a feature is supported.
            • In the case of edit logs, there is no need to check the layout version before processing an opcode: either the current software understands the opcode or it does not.
          2. TestDFSUpgradeFromImage now tests upgrade from a 0.22 image. I have commented out this test. Once the first Hudson test run passes, I will check in the 0.22 image first and then run the Hudson tests again.
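
          A minimal sketch of the feature-map idea described in item 1 above. The class and method names follow the LayoutVersion class added by the patch, but the feature names, version numbers, and hand-written sets below are illustrative (the real class derives each version's set from its ancestor), so treat this as an assumption-laden example rather than the actual code:

            import java.util.EnumSet;
            import java.util.HashMap;
            import java.util.Map;

            class LayoutVersionSketch {
              enum Feature { FSIMAGE_CHECKSUM, DELEGATION_TOKEN, FEDERATION }

              // For each layout version, the full set of features it supports,
              // so "does version X support feature F" becomes a set lookup
              // instead of a chain of numeric comparisons.
              private static final Map<Integer, EnumSet<Feature>> map =
                  new HashMap<Integer, EnumSet<Feature>>();

              static {
                map.put(-22, EnumSet.of(Feature.FSIMAGE_CHECKSUM));
                map.put(-24, EnumSet.of(Feature.FSIMAGE_CHECKSUM,
                                        Feature.DELEGATION_TOKEN));
                map.put(-35, EnumSet.of(Feature.FSIMAGE_CHECKSUM,
                                        Feature.DELEGATION_TOKEN,
                                        Feature.FEDERATION));
              }

              static boolean supports(Feature f, int layoutVersion) {
                EnumSet<Feature> set = map.get(layoutVersion);
                return set != null && set.contains(f);
              }
            }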
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12480662/HDFS-1936.2.patch
          against trunk revision 1128009.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 16 new or modified tests.

          -1 javadoc. The javadoc tool appears to have generated 1 warning messages.

          -1 javac. The patch appears to cause tar ant target to fail.

          -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:

          -1 contrib tests. The patch failed contrib unit tests.

          -1 system test framework. The patch failed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/644//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/644//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/644//console

          This message is automatically generated.

          Suresh Srinivas made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Suresh Srinivas made changes -
          Attachment HDFS-1936.2.patch [ 12480662 ]
          Suresh Srinivas added a comment -

          I will add a description of the change shortly. Submitting to kick off a Hudson build.

          Suresh Srinivas made changes -
          Attachment HDFS-1936.1.patch [ 12480628 ]
          Suresh Srinivas added a comment -

          Todd, here is the patch I have so far. This does not fix the upgrade failure yet. Can you look at this early version of the patch, since it is not likely to change when the upgrade issues are fixed?

          Suresh Srinivas added a comment -

          > imgVersion > -23
          This is a bug in the existing code. Delegation token was added in -24. So the check should be > -24.

          > Removed // added in -23
          I would revert this. The comments are replete with notes about when a particular field was added. I prefer not adding these details to comments, as we cannot keep that documentation really up to date.

          > DataNode.getUpgradeManagerDatanode, which treats the DN like a singleton...
          This is very similar to what is in 0.22, which uses a static method as well. I will look into this. If it is not closely related to this change, I will address it in a separate bug.

          Todd Lipcon added a comment -

          I think the above trace is actually a couple traces intermingled. But looking through the code a bit, it seems to be due to the fact that verifyDistributedUpgradeProgress is calling out to a static function DataNode.getUpgradeManagerDatanode, which treats the DN like a singleton. This obviously doesn't play well with a MiniDFSCluster when there are multiple DNs!

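          An illustrative sketch of why a singleton-style static accessor, as described above, misbehaves in MiniDFSCluster. The class and field names here are hypothetical, not the actual DataNode code:

            // Illustrative only -- hypothetical names, not Hadoop code.
            class SingletonStyleDataNode {
              // One static slot shared by every DataNode in the JVM.
              private static SingletonStyleDataNode instance;

              SingletonStyleDataNode() {
                instance = this;   // last-constructed DataNode wins
              }

              static SingletonStyleDataNode getUpgradeManagerDatanode() {
                // In a MiniDFSCluster several DataNodes share one JVM, so this
                // can return a different (or not-yet-initialized) node than the
                // caller expects -- which is how an NPE or startup hang can arise.
                return instance;
              }
            }
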
          Todd Lipcon made changes -
          Attachment hdfs-1936-with-testcase.txt [ 12480079 ]
          Attachment hadoop-22-dfs-dir.tgz [ 12480080 ]
          Todd Lipcon added a comment -

          Here's a tarball of an 0.22 data directory and test case which tests upgrading from it.

          Without Suresh's patch (which is included in this one) it failed every time. Now it passes sometimes and other times it times out. I can see the following NPE in the logs:

          [junit] 2011-05-22 20:09:43,246 WARN datanode.DataNode (DataNode.java:run(1213)) - Unexpected exception
          [junit] java.lang.NullPointerException
          [junit] at org.apache.hadoop.hdfs.server.datanode.DataNode.getUpgradeManagerDatanode(DataNode.java:1728)
          [junit] at org.apache.hadoop.hdfs.server.datanode.DataStorage.verifyDistributedUpgradeProgress(DataStorage.java:727)
          [junit] Exception in thread "DataNode: file:/home/todd/git/hadoop-hdfs/build/test/data/dfs/data/data1/,file:/home/todd/git/hadoop-hdfs/build/test/data/dfs/data/data2/" java.lang.NullPointerException
          [junit] at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:394)
          [junit] at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.cleanUp(DataNode.java:1002)
          [junit] at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:189)
          [junit] at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.run(DataNode.java:1217)
          [junit] at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:216)
          [junit] at java.lang.Thread.run(Thread.java:662)
          [junit] at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.setupBPStorage(DataNode.java:796)
          [junit] at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.setupBP(DataNode.java:773)

          Todd Lipcon added a comment -

          I think this looks good. Here are two comments:

          • not sure I follow this - this is just a fix of an incorrect check in a previous version?
                 private void loadSecretManagerState(DataInputStream in) throws IOException {
            -      if (imgVersion > -23) {
            +      if (imgVersion > -24) {
                     //SecretManagerState is not available.
                     //This must not happen if security is turned on.
            
          • why this change?
              *      OctalPerms (short -> String)  // Modified in -19
            - *    Symlink (String) // added in -23
            + *    Symlink (String)
              * NumINodesUnderConstruction (int)
            

          The PRE_RBW_LAYOUT_VERSION is a good point. I didn't actually notice that but it looks like you're right that it's not a correct check since it misses the 20x branches.
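          For illustration only, a minimal sketch of why a single threshold comparison misbehaves once the out-of-sequence 20x versions exist; the constant value and names below are assumptions for the example, not the actual Storage code:

            // Hypothetical sketch, not the real Storage class. RBW directories were
            // introduced at layout version -20, but the reserved versions -31 (0.20.203)
            // and -32 (0.20.204) do not have them even though they are "newer" numerically.
            class RbwCheckSketch {
              static final int PRE_RBW_LAYOUT_VERSION = -19;   // assumed value

              // Naive check: treats every LV at or beyond -20 as having RBW support,
              // which is wrongly true for -31 and -32.
              static boolean hasRbwNaive(int lv) {
                return lv < PRE_RBW_LAYOUT_VERSION;
              }

              // Corrected sketch: carve out the reserved 20x versions explicitly.
              static boolean hasRbw(int lv) {
                boolean reserved20x = (lv == -31 || lv == -32);
                return lv < PRE_RBW_LAYOUT_VERSION && !reserved20x;
              }
            }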

          Hide
          Suresh Srinivas added a comment -

          You might have been talking about this - Storage#PRE_RBW_LAYOUT_VERSION is not handled correctly. For this, we may need to add a more complicated check, one that covers -24 as well as versions later than -33.

          I think we could do this for trunk with isXSupported() in this jira, but I am reluctant to make that big a change for 0.22 and wanted to get this done with a smaller patch.

          Hide
          Suresh Srinivas added a comment -

          BTW, I did not describe this earlier - I am removing the check for LV while consuming operations from the editlog. Opcodes are supposed to be unique (at least that is the case right now), so there is no need to check for the LV at which an opcode was added.

          This simplifies some of the conflicts you might have been thinking about. Please look at the patch.
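          As a rough illustration of that point (the opcode values and handler names here are made up for the example, not the real FSEditLog format):

            import java.io.DataInputStream;
            import java.io.IOException;

            // Hypothetical editlog reader sketch: dispatch purely on the unique opcode.
            // An opcode introduced in a newer layout version simply never appears in an
            // editlog written by an older release, so no per-opcode LV gating is needed.
            class EditLogReaderSketch {
              static final byte OP_ADD = 0;                  // assumed values
              static final byte OP_DELETE = 2;
              static final byte OP_UPDATE_MASTER_KEY = 21;

              void applyRecord(DataInputStream in) throws IOException {
                byte op = in.readByte();
                switch (op) {
                  case OP_ADD:               loadAdd(in); break;
                  case OP_DELETE:            loadDelete(in); break;
                  case OP_UPDATE_MASTER_KEY: loadUpdateMasterKey(in); break;
                  default: throw new IOException("Unknown opcode " + op);
                }
              }

              private void loadAdd(DataInputStream in) { /* ... */ }
              private void loadDelete(DataInputStream in) { /* ... */ }
              private void loadUpdateMasterKey(DataInputStream in) { /* ... */ }
            }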

          Hide
          Suresh Srinivas added a comment -

          > It seems that this patch might break upgrade from 0.21?
          Can you describe how?

          > if we did the refactor now so that we can just encode that compression is in -25 through -30 and -35 onwards
          I think changing the -25 check to -33 makes sense, instead of this complicated check, right? Do you see a problem with that?

          Hide
          Todd Lipcon added a comment -

          I compiled the following list of versions:

            // -35: Adding support for block pools and multiple namenodes (federation)
            // -34: no new feature - just bumped after following three were reserved
            // The following three image versions are special out-of-sequence allocations
            // (i.e. they don't necessarily contain all features from "earlier" versions)
            // See HDFS-1822, HDFS-1842, 
            //   -33:    0.22.0
            //   -32:    0.20.204
            //   -31:    0.20.203
            // -30: HDFS-1070: store only last component of a path in image
            // -29: unused (skipped for no reason)
            // -28: HDFS-1630: fsedits are checksummed
            // -27: HDFS-259: remove intentionally corrupt pre-0.13 image directory
            // -26: image checksum
            // -25: image compression
            // -24: added OP_*_DELEGATION_TOKEN and OP_UPDATE_MASTER_KEY   *** 0.21.0 ***
            // -23: symlinks
            // -22: OP_CONCAT_DELETE
            // -21: atomic rename function
            // -20: datanode has a "rbw" subdirectory (append impl in 0.21)
            // -19: sticky bit                         *** CDH3, earlier 0.20.203 ***
            // -18: support disk space quotas          *** 0.20.2 ***
            // -17: support access time on files
            // -16: change edit log and fsimage to support quotas
            // -15: store generation stamp with each block
          
            // The special versions -31, -32, -33 are as follows:
            // -31: corresponds to release 0.20.203:
            //   * Includes functionality up to version -19
            //   * includes delegation token opcodes from -24, BUT
            //     with different opcode identifiers. Given this,
            //     we don't allow upgrade from this version.
            // -32:
            //   * Includes same set of features as version -31
            //   * Distinction is that the opcodes match trunk
            //
            // -33: 0.22.0
            //   * contains all features up through version -27
          

          It seems that this patch might break upgrade from 0.21?

          Rather than trying to force all the checks to particular "release boundaries", I really think it would be easier to follow if we did the refactor now so that we can just encode that compression is in -25 through -30 and -35 onwards?
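          For example, a helper along those lines might look like this sketch (the method name is hypothetical, and the exact treatment of the reserved versions -31 through -34 would need to be settled against the table above):

            // Hypothetical helper: keep the knowledge of which layout versions carry
            // image compression in one place instead of hardcoding numbers at call sites.
            // Layout versions are negative and grow more negative over time, so
            // "-25 through -30" means -30 <= layoutVersion <= -25.
            static boolean supportsFSImageCompression(int layoutVersion) {
              boolean inOldRange = layoutVersion <= -25 && layoutVersion >= -30;
              boolean inNewRange = layoutVersion <= -35;   // -35 onwards
              return inOldRange || inNewRange;
            }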

          Hide
          Todd Lipcon added a comment -

          Hi Suresh, sorry for the delay. I will try to get to this today - it's a little messy to review, with lots of different numbers flying by. I think it might actually be easier to review if we sequence this after the "isXSupported()" refactor?

          Hide
          Suresh Srinivas added a comment -

          Todd, can you let me know if you are very busy and cannot review the patch? I will work with others.

          Hide
          Suresh Srinivas added a comment -

          Todd, can you review the patch?

          Suresh Srinivas made changes -
          Attachment HDFS-1936.22.patch [ 12479156 ]
          Attachment HDFS-1936.trunk.patch [ 12479157 ]
          Hide
          Suresh Srinivas added a comment -

          Patch with changes proposed in the jira.

          For trunk I will further clean this up in a separate jira to add an isFeatureSupported() check and remove the direct checks against version numbers.

          Hide
          Todd Lipcon added a comment -

          Nice idea with the EnumSet

          Hide
          Suresh Srinivas added a comment -

          > Can I suggest that we add some static helper methods here
          I plan to add that.

          > Medium term maybe we can move away from the single-integer versioning to something like a list of flags
          I think the layout version is referenced in various places, so we cannot replace it outright. But I agree with you on adding a feature matrix, which looks something like:
          LV <-> EnumSet of features supported.

          With that we could move to using feature-supported checks for the functionality that depends on LV today. This also enables building a simple tool which, given two layout versions, can tell whether an upgrade is possible.
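          A rough sketch of what such a matrix could look like (the feature names, version entries, and API below are hypothetical, not the eventual implementation):

            import java.util.EnumSet;
            import java.util.HashMap;
            import java.util.Map;

            // Hypothetical feature-matrix sketch: map each layout version to the set of
            // features it supports, so callers ask about a feature instead of comparing
            // raw version numbers.
            class LayoutFeatureMatrix {
              enum Feature { IMAGE_COMPRESSION, IMAGE_CHECKSUM, EDITS_CHECKSUM, DELEGATION_TOKEN_OPS }

              private static final Map<Integer, EnumSet<Feature>> MATRIX =
                  new HashMap<Integer, EnumSet<Feature>>();
              static {
                // Illustrative entries only; a real table would cover every version.
                MATRIX.put(-25, EnumSet.of(Feature.IMAGE_COMPRESSION));
                MATRIX.put(-31, EnumSet.noneOf(Feature.class));          // 0.20.203: no image compression
                MATRIX.put(-33, EnumSet.of(Feature.IMAGE_COMPRESSION,
                                           Feature.IMAGE_CHECKSUM));     // 0.22
              }

              static boolean isSupported(int layoutVersion, Feature f) {
                EnumSet<Feature> features = MATRIX.get(layoutVersion);
                return features != null && features.contains(f);
              }
            }

          A call site would then ask isSupported(lv, Feature.IMAGE_COMPRESSION) instead of comparing lv against -25 directly, and an upgrade-compatibility tool could simply compare the two feature sets.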

          Hide
          Todd Lipcon added a comment -

          Good digging. Can I suggest that we add some static helper methods here, something like FSImageFormat.versionSupportsFSImageCompression(layoutVersion), rather than the hardcoded version numbers we've got all over?

          Medium term, maybe we can move away from the single-integer versioning to something like a list of flags? e.g.

          supported_features={image_compression,edits_checksum,image_checksum,delegation_token_ops,...}

          That would have gotten us out of this mess somewhat, right?

          Suresh Srinivas made changes -
          Link This issue blocks HDFS-1822 [ HDFS-1822 ]
          Suresh Srinivas made changes -
          Link This issue blocks HDFS-1842 [ HDFS-1842 ]
          Suresh Srinivas made changes -
          Summary Updating the layout version from HDFS-1822 causes problems where logic depends on layout version Updating the layout version from HDFS-1822 causes upgrade problems.
          Hide
          Suresh Srinivas added a comment -

          The changes were:

          • 0.20.203 moved from LV -19 to -31
          • 0.20.204 moved to LV -32
          • 0.22 from -27 to -33
          • Trunk to -34

          This results in the following problems (all ranges are inclusive):

          1. Functionality added from LV -20 to -30 is not in the 0.20.203 release, and functionality from LV -28 to -32 is not in 0.22.
          2. As an example, take FSImage compression, added in LV -25. When upgrading to trunk from 0.20.203 (LV -31), the namenode code expects FSImage compression to be available (because -31 is a later LV than -25). This functionality is in 0.22 and trunk but not in 0.20.203, so the upgrade fails.

          Solution:

          1. In 0.22, change the LV checks that currently test for versions in the range -20 to -27 to test for -33 instead.
          2. In trunk, change the LV checks that currently test for versions in the range -28 to -33 to test for -34 instead (a sketch of this kind of boundary change follows below).
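          As an illustration of the kind of boundary change described above (a hypothetical snippet, not the literal patch):

            // Before: image compression was introduced at LV -25, so any LV numerically
            // at or beyond -25 was assumed to have it - which wrongly includes
            // 0.20.203 (-31) and 0.20.204 (-32).
            boolean readCompressionHeader = imgVersion <= -25;

            // After (0.22-style fix described above): only images at or beyond 0.22's
            // own LV (-33) are assumed to carry the feature on the 0.22 upgrade path.
            boolean readCompressionHeaderFixed = imgVersion <= -33;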
          Todd Lipcon made changes -
          Field Original Value New Value
          Description In HDFS_1822 and HDFS_1842, the layout versions for 203, 204, 22 and trunk were changed. Some of the namenode logic that depends on layout version is broken because of this. Read the comment for more description. In HDFS-1822 and HDFS-1842, the layout versions for 203, 204, 22 and trunk were changed. Some of the namenode logic that depends on layout version is broken because of this. Read the comment for more description.
          Suresh Srinivas created issue -

            People

            • Assignee:
              Suresh Srinivas
              Reporter:
              Suresh Srinivas
            • Votes:
              0
              Watchers:
              3
