Hadoop Common / HADOOP-3677

Problems with generation stamp upgrade

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.18.0
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels: None
    • Release Note: Simplify generation stamp upgrade by making it a local upgrade on datanodes. Deleted distributed upgrade.

    Description

      1. The generation stamp upgrade renames blocks' meta-files so that the name contains the block's generation stamp, as stated in HADOOP-2656.
        If a data-node has blocks that do not belong to any files, and the name-node asks the data-node to remove those blocks
        during or before the upgrade, the data-node will remove the blocks but not the meta-files, because their names
        are still in the old format, which the new code does not recognize. So we can end up with a number of garbage files that
        are hard to recognize as unused, and the system will never remove them automatically (a small sketch of the two naming
        formats follows this list). I think this should eventually be handled by the upgrade code, but it may be right to fix
        HADOOP-3002 for the 0.18 release, which would avoid scheduling block removal while the name-node is in safe mode.
      2. I was not able to get the upgrade -force option to work. This option lets the name-node proceed with a distributed upgrade even if
        the data-nodes are not able to complete their local upgrades. Did we test this feature at all for the generation stamp upgrade?
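
      The naming change behind problem 1 can be pictured with a minimal sketch. This is illustrative only: the old and new
      meta-file name formats are those described in HADOOP-2656, while the class name, helper methods, and sample block/stamp
      values below are assumptions made for this example, not code from Hadoop.

      // Illustrative sketch only: shows why a 0.18 data-node deleting block 123
      // removes blk_123 but leaves an old-format blk_123.meta behind.
      import java.io.File;
      import java.io.FilenameFilter;

      public class MetaFileNaming {
        // Pre-HADOOP-2656:  blk_<blockId>.meta
        // Post-HADOOP-2656: blk_<blockId>_<generationStamp>.meta
        static String oldMetaName(long blockId) {
          return "blk_" + blockId + ".meta";
        }
        static String newMetaName(long blockId, long generationStamp) {
          return "blk_" + blockId + "_" + generationStamp + ".meta";
        }

        /** Meta-files still in the old (pre-generation-stamp) format. */
        static File[] findOldFormatMetaFiles(File blockDir) {
          return blockDir.listFiles(new FilenameFilter() {
            public boolean accept(File dir, String name) {
              // old names contain a single '_' (the one after "blk"); new names contain two
              return name.startsWith("blk_") && name.endsWith(".meta")
                  && name.indexOf('_', "blk_".length()) < 0;
            }
          });
        }

        public static void main(String[] args) {
          System.out.println(oldMetaName(123));        // blk_123.meta
          System.out.println(newMetaName(123, 1001));  // blk_123_1001.meta
        }
      }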

      Attachments

        1. HADOOP-3677-trunk.patch
          5 kB
          Raghu Angadi
        2. HADOOP-3677-trunk.patch
          44 kB
          Raghu Angadi
        3. HADOOP-3677-branch-018.patch
          39 kB
          Raghu Angadi

        Issue Links

          Activity

            dhruba Dhruba Borthakur added a comment -

            One workaround is as follows:

            1. Shut down namenode and then restart namenode (with existing release). This will cause datanodes to send block reports and delete blocks that are not in the namespace.

            2. Shut down the cluster. Install new software on all nodes. Restart with the -upgrade option. This will not have to delete blocks because orphaned blocks were already deleted in step 1.

            If this workaround sounds feasible, then we can remove this issue from the 0.18 Blocker list.


            shv Konstantin Shvachko added a comment -

            > If this workaround sounds feasible, then we can remove this issue from the 0.18 Blocker list.

            And document this procedure in release notes?

            rangadi Raghu Angadi added a comment -

            I think it is better to fix the problem than to ask users to go through a different procedure for the upgrade. There are users with vastly different levels of expertise and caution. What happens if they don't follow the procedure by mistake?

            Also, this special workaround would need to be followed even in the future, when someone upgrades from 0.17 to 0.20.


            shv Konstantin Shvachko added a comment -

            Maybe a good solution would be to convert the distributed upgrade into a local data-node upgrade.
            It would solve both of the problems above, plus eliminate the warning message reported in HADOOP-3732.
            The only disadvantage of this approach I can see is that data-nodes will take a rather long time to start up, around 5 minutes each on a large cluster.
            But this can be addressed by including reasonable messages about the upgrade progress.


            dhruba Dhruba Borthakur added a comment -

            I disagree with Raghu to a certain extent. From my experience, when an administrator wants to upgrade the cluster, he/she restarts the namenode (before installing new software) so that the transaction log is consumed. This ensures that restarting with a new version of the software does not cause any unwanted interactions with edits-log processing.

            The workaround listed in this issue is just an extension of the above procedure.

            Konstantin: can you please explain what a local data-node upgrade is, and why it solves this bug? Thanks.

            rangadi Raghu Angadi added a comment -

            If the procedure is optional, then it's OK. I was mainly commenting on "required" extra steps. Since this was a blocker, I thought the workaround was required.

            rangadi Raghu Angadi added a comment -

            I haven't looked at the code around the upgrade or generation stamps yet. I think the basic fix (or workaround) that Konstantin is suggesting is to rename all the metadata files with the default generation stamp when the Datanode starts, before reporting them to the Namenode.

            The main advantage I see is that it will avoid extra warnings for every block.
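
            A minimal sketch of that startup-time rename idea, under stated assumptions: the pre-generation-stamp meta-file
            name format blk_<id>.meta comes from the description above, while the default ("grandfather") stamp value of 0 and
            the class and method names here are assumptions for illustration, not the committed code.

            // Illustrative sketch, not the committed patch: rename any pre-generation-stamp
            // meta-file blk_<id>.meta to blk_<id>_<defaultStamp>.meta before block reports.
            import java.io.File;
            import java.io.IOException;
            import java.util.regex.Matcher;
            import java.util.regex.Pattern;

            public class StartupMetaRename {
              static final long DEFAULT_GENERATION_STAMP = 0; // assumed "grandfather" stamp
              static final Pattern OLD_META = Pattern.compile("blk_(-?\\d+)\\.meta");

              static void renameOldMetaFiles(File blockDir) throws IOException {
                File[] files = blockDir.listFiles();
                if (files == null) {
                  return;
                }
                for (File f : files) {
                  Matcher m = OLD_META.matcher(f.getName());
                  if (!m.matches()) {
                    continue;   // block file or already-renamed meta-file
                  }
                  File renamed = new File(blockDir,
                      "blk_" + m.group(1) + "_" + DEFAULT_GENERATION_STAMP + ".meta");
                  if (!f.renameTo(renamed)) {
                    throw new IOException("Failed to rename " + f + " to " + renamed);
                  }
                }
              }
            }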


            dhruba Dhruba Borthakur added a comment -

            Changing the distributed -upgrade to a local upgrade sounds OK (though not very essential) to me. It seems to be another hack: the real problem is that the distributed upgrade framework does not yet have a mechanism to say "do not send block reports before the distributed upgrade is complete".

            Another alternative would be to keep the namenode in safe mode until the distributed upgrade is complete... isn't this already true today?

            rangadi Raghu Angadi added a comment -

            > ... specify that "do not send block reports before distributed upgrade is complete".
            Yes, we could fix it with more features like this. Still, we will be left with thousands of warning messages. The question is what we do for this jira.

            Whether a local upgrade is a hack is, I think, debatable. It makes logical sense to me: the Datanode metadata file name format has changed between 0.17 and 0.18, so the datanode converts these names to the new format when it is upgraded.

            In any case, a hack that only the core developers need to know about might be more desirable than a hack in the upgrade procedure that all admins need to be aware of.

            If there is consensus on converting the metadata file names when the datanode starts up, then I will submit a patch.


            dhruba Dhruba Borthakur added a comment -

            Hi Raghu, I am perfectly ok with the fix you are proposing. Thanks for taking this up.


            shv Konstantin Shvachko added a comment -

            The advantages of the local upgrade (or, as we also call it, the "version upgrade") over a distributed upgrade are that it avoids both problems stated in this jira, and also avoids thousands of warnings during data-node startup.
            In my opinion, the local/version upgrade is logically correct and is not a "hack" at all, because each data-node can complete the upgrade on its own without interacting with other data-nodes or the name-node. The distributed upgrade should be used when such intercommunication is required; e.g., during the crc-upgrade this was unavoidable.
            The startup time will be the only disadvantage, so the upgrade progress should be logged every 20-30 seconds.
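
            A minimal sketch of the kind of time-based progress logging suggested here (illustrative only; the class name,
            method, and interval constant are assumptions, not part of any patch):

            // Illustrative only: during the local upgrade, log progress at most once
            // every ~30 seconds so long data-node startups remain visible in the logs.
            public class UpgradeProgressLogger {
              private static final long LOG_INTERVAL_MS = 30 * 1000L;
              private long lastLogTime = 0;
              private long processed = 0;

              /** Call once per block that has been linked/renamed. */
              void blockProcessed(long totalBlocks) {
                processed++;
                long now = System.currentTimeMillis();
                if (now - lastLogTime >= LOG_INTERVAL_MS) {
                  lastLogTime = now;
                  System.out.println("Generation stamp upgrade: processed "
                      + processed + " of " + totalBlocks + " blocks");
                }
              }
            }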

            rangadi Raghu Angadi added a comment -

            Suggested patch for trunk. It has the minimum changes needed for the upgrade to work and disables the distributed upgrade. It does not take any longer than a normal version upgrade and thus does not need more notifications than a normal upgrade has.

            Once this fix is OK, I will submit a patch that removes the code related to the distributed upgrade, or it could be done in a separate jira.


            shv Konstantin Shvachko added a comment -

            I like this approach because it combines hard-linking with renaming and therefore does the task with no overhead (a rough sketch of this hard-link-plus-rename idea follows this list).

            1. I agree the distributed upgrade code should be removed if we do this.
            2. The constant oldMetaFileNamePattern should be in all capital letters; I'd prefer a name like
              PRE_GENERATION_STAMP_META_FILE_PATTERN.
            3. linkBlocks() does not need the newLV parameter.
            4. Spelling: "currect".
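
            A rough sketch of that hard-link-plus-rename step, under stated assumptions: the method name linkBlocks() and the
            constant name come from the review above, but the signature, the regex, the default stamp value of 0, and the use
            of java.nio for hard links are assumptions for illustration; the actual 0.18-era patch differs.

            // Rough illustrative sketch (not the actual patch): hard-link block files from
            // the previous storage directory into current/, renaming old-format meta-files
            // to the new <blockId>_<generationStamp> format along the way.
            import java.io.File;
            import java.io.IOException;
            import java.util.regex.Matcher;
            import java.util.regex.Pattern;

            public class GenStampUpgradeSketch {
              // constant name suggested in the review above; the pattern itself is an assumption
              static final Pattern PRE_GENERATION_STAMP_META_FILE_PATTERN =
                  Pattern.compile("blk_(-?\\d+)\\.meta");
              static final long DEFAULT_GENERATION_STAMP = 0; // assumed default stamp

              /** Recursively hard-link 'from' into 'to', renaming old-format meta-files. */
              static void linkBlocks(File from, File to) throws IOException {
                if (from.isDirectory()) {
                  if (!to.mkdirs() && !to.isDirectory()) {
                    throw new IOException("Cannot create directory " + to);
                  }
                  String[] names = from.list();
                  if (names == null) {
                    return;
                  }
                  for (String name : names) {
                    Matcher m = PRE_GENERATION_STAMP_META_FILE_PATTERN.matcher(name);
                    String newName = m.matches()
                        ? "blk_" + m.group(1) + "_" + DEFAULT_GENERATION_STAMP + ".meta"
                        : name;
                    linkBlocks(new File(from, name), new File(to, newName));
                  }
                  return;
                }
                // create 'to' as a hard link to 'from' (java.nio, Java 7+); the 0.18-era
                // code used its own hard-link helper instead
                java.nio.file.Files.createLink(to.toPath(), from.toPath());
              }
            }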
            rangadi Raghu Angadi added a comment -

            Thanks for the review, Konstantin.

            The attached patches for trunk and 0.18 have the suggested changes and remove the GenStamp distributed upgrade.

            The patch is large mainly because it removes around 1000 lines. Three files are deleted in trunk and one in 0.18.

            hadoopqa Hadoop QA added a comment -

            -1 overall. Here are the results of testing the latest attachment
            http://issues.apache.org/jira/secure/attachment/12386129/HADOOP-3677-trunk.patch
            against trunk revision 677470.

            +1 @author. The patch does not contain any @author tags.

            -1 tests included. The patch doesn't appear to include any new or modified tests.
            Please justify why no tests are needed for this patch.

            +1 javadoc. The javadoc tool did not generate any warning messages.

            +1 javac. The applied patch does not increase the total number of javac compiler warnings.

            +1 findbugs. The patch does not introduce any new Findbugs warnings.

            +1 release audit. The applied patch does not increase the total number of release audit warnings.

            +1 core tests. The patch passed core unit tests.

            +1 contrib tests. The patch passed contrib unit tests.

            Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2885/testReport/
            Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2885/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
            Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2885/artifact/trunk/build/test/checkstyle-errors.html
            Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2885/console

            This message is automatically generated.

            rangadi Raghu Angadi added a comment -

            I just committed this.

            hudson Hudson added a comment -

            Integrated in Hadoop-trunk #581 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/581/ )

            People

              Assignee: rangadi Raghu Angadi
              Reporter: shv Konstantin Shvachko
              Votes: 0
              Watchers: 1

              Dates

                Created:
                Updated:
                Resolved: