Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-3404

Make putImage in GetImageServlet infer remote address to fetch from request

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.0.2-alpha
    • Component/s: None
    • Labels:
      None

      Description

      As it stands, daemons which perform checkpointing must determine their own address on which they can be reached, so that the NN which they checkpoint against knows what address to fetch a merged fsimage from. This causes problems if, for example, the daemon performing checkpointing binds to 0.0.0.0, and thus can't be sure of what address the NN can reach it at.

      1. HDFS-3404.patch
        4 kB
        Aaron T. Myers
      2. HDFS-3404.patch
        3 kB
        Aaron T. Myers

        Issue Links

          Activity

          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #1078 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1078/)
          HDFS-3404. Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755)

          Result = SUCCESS
          atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk #1078 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1078/ ) HDFS-3404 . Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #1042 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1042/)
          HDFS-3404. Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755)

          Result = FAILURE
          atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #1042 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1042/ ) HDFS-3404 . Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #2255 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2255/)
          HDFS-3404. Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755)

          Result = ABORTED
          atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Show
          Hudson added a comment - Integrated in Hadoop-Mapreduce-trunk-Commit #2255 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2255/ ) HDFS-3404 . Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755) Result = ABORTED atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Common-trunk-Commit #2238 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2238/)
          HDFS-3404. Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755)

          Result = SUCCESS
          atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Show
          Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #2238 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2238/ ) HDFS-3404 . Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Hide
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #2312 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2312/)
          HDFS-3404. Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755)

          Result = SUCCESS
          atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755
          Files :

          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Show
          Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #2312 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2312/ ) HDFS-3404 . Make putImage in GetImageServlet infer remote address to fetch from request. Contributed by Aaron T. Myers. (Revision 1337755) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337755 Files : /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
          Hide
          Aaron T. Myers added a comment -

          Thanks a lot for the reviews, Eli. I've just committed this to trunk and branch-2.

          Show
          Aaron T. Myers added a comment - Thanks a lot for the reviews, Eli. I've just committed this to trunk and branch-2.
          Hide
          Eli Collins added a comment -

          +1 looks great

          Show
          Eli Collins added a comment - +1 looks great
          Hide
          Aaron T. Myers added a comment -

          Updating JIRA summary. Somehow I missed a pretty important word.

          Show
          Aaron T. Myers added a comment - Updating JIRA summary. Somehow I missed a pretty important word.
          Hide
          Aaron T. Myers added a comment -

          I just tested out this patch manually on a 2-node HA cluster and it worked as expected. Each node is able to checkpoint to the other, even though each node has bound both its own RPC and HTTP addresses 0.0.0.0. Each node need only have the RPC address of the other node set to an actual address for everything to work properly.

          Show
          Aaron T. Myers added a comment - I just tested out this patch manually on a 2-node HA cluster and it worked as expected. Each node is able to checkpoint to the other, even though each node has bound both its own RPC and HTTP addresses 0.0.0.0. Each node need only have the RPC address of the other node set to an actual address for everything to work properly.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12526479/HDFS-3404.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2424//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2424//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526479/HDFS-3404.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2424//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2424//console This message is automatically generated.
          Hide
          Aaron T. Myers added a comment -

          Here's an updated patch which addresses Eli's review feedback.

          I struggled for a while with how to write an automated test for this, and ultimately concluded it's not really possible on a single host, since connecting to 0.0.0.0 will work on a single box, whereas it wouldn't in a multi-box setup. I'll test this patch manually in a multi-node setup tomorrow.

          Show
          Aaron T. Myers added a comment - Here's an updated patch which addresses Eli's review feedback. I struggled for a while with how to write an automated test for this, and ultimately concluded it's not really possible on a single host, since connecting to 0.0.0.0 will work on a single box, whereas it wouldn't in a multi-box setup. I'll test this patch manually in a multi-node setup tomorrow.
          Hide
          Aaron T. Myers added a comment -

          This change needs to be made to the 2NN as well right or were you thinking just the SBN?

          Nope, there's no change to be made to the 2NN. The 2NN doesn't do the same sort of validation that the SBN does that the configured NN HTTP address is not INADDR_ANY. The 2NN will automatically start behaving in the same way the SBN does just by virtue of the fact that it's connecting to an NN which doesn't look at the machine name in the param string. The 2NN will also stop sending the machine name in the param string, by virtue of the fact that it uses GetImageServlet#getParamStringToPutImage to form the param string.

          I also tested this patch with an NN/2NN, and it works just fine.

          NetUtils#isIpAddress actually checks ip:port, seems like we'll always have an IP here. Perhaps better to use InetAddresses.isInetAddress.'

          Sure, makes sense. I'll update the patch to suit.

          How much more difficult would it be to just have it do a straight HTTP POST or PUT of the new image instead of the "I'll ask you to ask me for this image" dance?

          I investigated what it would take to do this a little bit, and concluded that to do it right would take a fair bit of refactoring that's well outside the modest scope of this JIRA. I've filed a separate JIRA to make this change, I hope that's OK: HDFS-3405

          Show
          Aaron T. Myers added a comment - This change needs to be made to the 2NN as well right or were you thinking just the SBN? Nope, there's no change to be made to the 2NN. The 2NN doesn't do the same sort of validation that the SBN does that the configured NN HTTP address is not INADDR_ANY. The 2NN will automatically start behaving in the same way the SBN does just by virtue of the fact that it's connecting to an NN which doesn't look at the machine name in the param string. The 2NN will also stop sending the machine name in the param string, by virtue of the fact that it uses GetImageServlet#getParamStringToPutImage to form the param string. I also tested this patch with an NN/2NN, and it works just fine. NetUtils#isIpAddress actually checks ip:port, seems like we'll always have an IP here. Perhaps better to use InetAddresses.isInetAddress.' Sure, makes sense. I'll update the patch to suit. How much more difficult would it be to just have it do a straight HTTP POST or PUT of the new image instead of the "I'll ask you to ask me for this image" dance? I investigated what it would take to do this a little bit, and concluded that to do it right would take a fair bit of refactoring that's well outside the modest scope of this JIRA. I've filed a separate JIRA to make this change, I hope that's OK: HDFS-3405
          Hide
          Todd Lipcon added a comment -

          How much more difficult would it be to just have it do a straight HTTP POST or PUT of the new image instead of the "I'll ask you to ask me for this image" dance?

          Show
          Todd Lipcon added a comment - How much more difficult would it be to just have it do a straight HTTP POST or PUT of the new image instead of the "I'll ask you to ask me for this image" dance?
          Hide
          Eli Collins added a comment -

          The approach - have the NN determine the hostname of the checkpointer from the request rather than have it passed as a parameter - seems more sane to me.

          • This change needs to be made to the 2NN as well right or were you thinking just the SBN?
          • NetUtils#isIpAddress actually checks ip:port, seems like we'll always have an IP here. Perhaps better to use InetAddresses.isInetAddress.
          Show
          Eli Collins added a comment - The approach - have the NN determine the hostname of the checkpointer from the request rather than have it passed as a parameter - seems more sane to me. This change needs to be made to the 2NN as well right or were you thinking just the SBN? NetUtils#isIpAddress actually checks ip:port, seems like we'll always have an IP here. Perhaps better to use InetAddresses.isInetAddress.
          Hide
          Aaron T. Myers added a comment -

          Here's an initial patch to make sure folks are OK with the approach. I'm still mulling over how best to write tests for this, which is a tad difficult on a single-node machine.

          I tested this manually by setting up an HA setup where each NN itself binds to 0.0.0.0, but has actual addresses for the other NN. It worked as expected.

          Show
          Aaron T. Myers added a comment - Here's an initial patch to make sure folks are OK with the approach. I'm still mulling over how best to write tests for this, which is a tad difficult on a single-node machine. I tested this manually by setting up an HA setup where each NN itself binds to 0.0.0.0, but has actual addresses for the other NN. It worked as expected.

            People

            • Assignee:
              Aaron T. Myers
              Reporter:
              Aaron T. Myers
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development