Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-1875

MiniDFSCluster hard-codes dfs.datanode.address to localhost

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.23.0
    • Component/s: test
    • Labels:
      None

      Description

      When creating RPC addresses that represent the communication sockets for each simulated DataNode, the MiniDFSCluster class hard-codes the address of the dfs.datanode.address port to be "127.0.0.1:0"

      The DataNodeCluster test tool uses the MiniDFSCluster class to create a selected number of simulated datanodes on a single host. In the DataNodeCluster setup, the NameNode is not simulated but is started as a separate daemon.

      The problem is that if the write requrests into the simulated datanodes are originated on a host that is not the same host running the simulated datanodes, the connections are refused. This is because the RPC sockets that are started by MiniDFSCluster are for "localhost" (127.0.0.1) and are not accessible from outside that same machine.

      It is proposed that the MiniDFSCluster.setupDatanodeAddress() method be overloaded in order to accommodate an environment where the NameNode is on one host, the client is on another host, and the simulated DataNodes are on yet another host (or even multiple hosts simulating multiple DataNodes each).

      The overloaded API would add a parameter that would be used as the basis for creating the RPS sockets. By default, it would remain 127.0.0.1

      1. HDFS-1875.patch
        11 kB
        Eric Payne
      2. HDFS-1875.patch
        9 kB
        Eric Payne

        Activity

        Hide
        Tanping Wang added a comment -

        I like this idea. It would be really useful if we can have multiple simulated data nodes binded to different hosts and dfs client binded to a particular host. And futher down the road, some of the simulated data nodes on different hosts, but the same rack. We can use this to test network topology distance related issues.

        One of the related problem that I ran into was that the order of data nodes in LocatedBlock returned by name nodes is sorted by NetworkTopology#pseudoSortByDistance(). In current Mini dfs cluster, there is no way I can bind the client to a host or bind a simulated data node to a particular host/rack. It would be nice if mini dfs cluster can make this possible, so that the network topology distance of client to each data node is fixed. Therefore, the order of data nodes returned within a LocatedBlock on MiniDFS cluster is fixed. Currently the order of data nodes in LocatedBlock is randomly sorted which means NetworkTopology understand the DFSClient and simulated datanodes are not different hosts and different racks.

        Also in currently Mini DFS client provides the opton of -racks when starting data nodes. But we can not bind multiple simulated data nodes to one rack... so it is not really that useful.

        Show
        Tanping Wang added a comment - I like this idea. It would be really useful if we can have multiple simulated data nodes binded to different hosts and dfs client binded to a particular host. And futher down the road, some of the simulated data nodes on different hosts, but the same rack. We can use this to test network topology distance related issues. One of the related problem that I ran into was that the order of data nodes in LocatedBlock returned by name nodes is sorted by NetworkTopology#pseudoSortByDistance(). In current Mini dfs cluster, there is no way I can bind the client to a host or bind a simulated data node to a particular host/rack. It would be nice if mini dfs cluster can make this possible, so that the network topology distance of client to each data node is fixed. Therefore, the order of data nodes returned within a LocatedBlock on MiniDFS cluster is fixed. Currently the order of data nodes in LocatedBlock is randomly sorted which means NetworkTopology understand the DFSClient and simulated datanodes are not different hosts and different racks. Also in currently Mini DFS client provides the opton of -racks when starting data nodes. But we can not bind multiple simulated data nodes to one rack... so it is not really that useful.
        Hide
        Eric Payne added a comment -

        I added an additional aprameter to private void setupDatanodeAddress() in MiniDFSCluster.java. If true, this parameter tells setupDatanodeAddress() to check the following properties before overriding them with localhost:
        dfs.datanode.address
        dfs.datanode.http.address
        dfs.datanode.ipc.address

        If the new parameter is true, setupDatanodeAddress() will use the above properties in creating sockets if those properties exist. Otherwise, setupDatanodeAddress() will use localhost, as before.

        Show
        Eric Payne added a comment - I added an additional aprameter to private void setupDatanodeAddress() in MiniDFSCluster.java. If true, this parameter tells setupDatanodeAddress() to check the following properties before overriding them with localhost: dfs.datanode.address dfs.datanode.http.address dfs.datanode.ipc.address If the new parameter is true, setupDatanodeAddress() will use the above properties in creating sockets if those properties exist. Otherwise, setupDatanodeAddress() will use localhost, as before.
        Hide
        Eric Payne added a comment -

        Patch to MiniDFSCluster and DATANodeCluster

        Show
        Eric Payne added a comment - Patch to MiniDFSCluster and DATANodeCluster
        Hide
        Eric Payne added a comment -

        No change to existing APIs or functionality. Additional functionality was added as overloaded API.

        Show
        Eric Payne added a comment - No change to existing APIs or functionality. Additional functionality was added as overloaded API.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12479677/HDFS-1875.patch
        against trunk revision 1124459.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 9 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
        org.apache.hadoop.hdfs.TestFileConcurrentReader
        org.apache.hadoop.tools.TestJMXGet

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/573//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/573//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/573//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479677/HDFS-1875.patch against trunk revision 1124459. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/573//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/573//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/573//console This message is automatically generated.
        Hide
        Eric Payne added a comment -

        Test failures are not related to this patch.

        They were failing in several of the previous builds as well. Reference Build #556, for e.g.

        Show
        Eric Payne added a comment - Test failures are not related to this patch. They were failing in several of the previous builds as well. Reference Build #556, for e.g.
        Hide
        Eric Payne added a comment -

        In order to keep the scope of this Jira small, I have opened HDFS-1962 to cover Tanping's topology enhancement idea.

        Show
        Eric Payne added a comment - In order to keep the scope of this Jira small, I have opened HDFS-1962 to cover Tanping's topology enhancement idea.
        Hide
        Eric Payne added a comment -

        I have verified that the attached patch also applies to the yahoo-merge branch.

        Show
        Eric Payne added a comment - I have verified that the attached patch also applies to the yahoo-merge branch.
        Hide
        Matt Foley added a comment -

        Hi Eric, looks generally good. A few comments:

        1. MiniDFSCluster
          • startDataNodes() - the "@param checkDataNodeAddrConfig" comment has been added to the javadocs for the stubbed-out original method signature. It needs to be associated with the new longer-args-list method signature.
        1. TestDFSAddressConfig
          • Configuration.unset() method is only available in trunk, not yahoo-merge, at this time. This was added to trunk in HADOOP-7001 on Nov 24, 2010, but seems to not be in yahoo-merge. Have you asked Suresh to merge HADOOP-7001 to yahoo-merge?
        1. DataNodeCluster
          • In the class javadocs and/or the USAGE string, should document the change in semantics - that the datanode addresses specified in config will be used unless not set.

        Also, with this setup, if you then run the Client on the same server as the DataNodeCluster, can the Client communicate successfully? Or does it try to use 127.0.0.1 and fail to connect? Have you searched for other unit test classes that use a Client and a DataNodeCluster? Do they all still work?

        Thanks.

        Show
        Matt Foley added a comment - Hi Eric, looks generally good. A few comments: MiniDFSCluster startDataNodes() - the "@param checkDataNodeAddrConfig" comment has been added to the javadocs for the stubbed-out original method signature. It needs to be associated with the new longer-args-list method signature. TestDFSAddressConfig Configuration.unset() method is only available in trunk, not yahoo-merge, at this time. This was added to trunk in HADOOP-7001 on Nov 24, 2010, but seems to not be in yahoo-merge. Have you asked Suresh to merge HADOOP-7001 to yahoo-merge? DataNodeCluster In the class javadocs and/or the USAGE string, should document the change in semantics - that the datanode addresses specified in config will be used unless not set. Also, with this setup, if you then run the Client on the same server as the DataNodeCluster, can the Client communicate successfully? Or does it try to use 127.0.0.1 and fail to connect? Have you searched for other unit test classes that use a Client and a DataNodeCluster? Do they all still work? Thanks.
        Hide
        Matt Foley added a comment -

        Assumed context for the last question above is a box where datanode config addresses are specified in the environment, e.g. by hdfs-site.xml.

        Show
        Matt Foley added a comment - Assumed context for the last question above is a box where datanode config addresses are specified in the environment, e.g. by hdfs-site.xml.
        Hide
        Eric Payne added a comment -

        Updated after review comments.

        Show
        Eric Payne added a comment - Updated after review comments.
        Hide
        Eric Payne added a comment -

        Thanks for your review, Matt.

        I have submitted a new patch that should address the conerns raised in your comment above.

        1. MiniDFSCluster

        • I reverted the original function javadocs header and created a new one for the new startDataNodes() method.

        2. TestDFSAddressConfig

        • Nice catch on this one. I added HADOOP-7001 to the list for merge into yahoo-trunk, and it has now been merged.

        3. With the previous patch, there were some cases where DataNodeCluster was getting errors with conflicting ports. So, in the new patch, I added a command-line parameter (-checkDataNodeAddrConfig) that will implement the new functionality. That is, with the new parameter, it will use the dfs.datanode.address.* values from the config, if set. Otherwise, if the -checkDataNodeAddrConfig parameter is not set, it will keep the existing behavior.

        I have tested the behaviors with and without the parameter, and the results are as expected.

        Show
        Eric Payne added a comment - Thanks for your review, Matt. I have submitted a new patch that should address the conerns raised in your comment above. 1. MiniDFSCluster I reverted the original function javadocs header and created a new one for the new startDataNodes() method. 2. TestDFSAddressConfig Nice catch on this one. I added HADOOP-7001 to the list for merge into yahoo-trunk, and it has now been merged. 3. With the previous patch, there were some cases where DataNodeCluster was getting errors with conflicting ports. So, in the new patch, I added a command-line parameter (-checkDataNodeAddrConfig) that will implement the new functionality. That is, with the new parameter, it will use the dfs.datanode.address.* values from the config, if set. Otherwise, if the -checkDataNodeAddrConfig parameter is not set, it will keep the existing behavior. I have tested the behaviors with and without the parameter, and the results are as expected.
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12481423/HDFS-1875.patch
        against trunk revision 1131264.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 9 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.TestHDFSTrash

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/705//testReport/
        Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/705//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/705//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12481423/HDFS-1875.patch against trunk revision 1131264. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 9 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/705//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/705//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/705//console This message is automatically generated.
        Hide
        Matt Foley added a comment -

        +1. The one auto-test failure is unrelated.

        Committed to trunk. Thanks, Eric!

        Show
        Matt Foley added a comment - +1. The one auto-test failure is unrelated. Committed to trunk. Thanks, Eric!
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #746 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/746/)

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #746 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/746/ )
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #699 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/699/)

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk #699 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/699/ )

          People

          • Assignee:
            Eric Payne
            Reporter:
            Eric Payne
          • Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development