  1. Hadoop HDFS
  2. HDFS-9868

Add ability for DistCp to run between 2 clusters


Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: distcp
    • Labels: None

    Description

      HDFS clusters are normally HA-enabled. Copying a large amount of data with DistCp can take a long time. If the source cluster's active NameNode changes during the copy, the DistCp job fails. This patch lets DistCp read source cluster files in HA access mode, so that a copy can continue across such a change. A source cluster configuration file must be specified via the -sourceClusterConf option.
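
To illustrate why HA access matters for a long copy, here is a toy Python model of client-side failover. It is only a sketch: `StandbyError`, `call_with_failover`, and `list_status` are hypothetical names made up for this example, and the logic is a simplification, not the code in the patch or in Hadoop's failover proxy provider.

```python
class StandbyError(Exception):
    """Raised by a simulated NameNode that is not currently active."""


def call_with_failover(namenodes, operation, max_attempts=4):
    """Try `operation` against the configured NameNodes in round-robin order.

    `namenodes` is an ordered list of addresses. `operation` takes one
    address and either returns a result or raises StandbyError.
    """
    last_error = None
    for attempt in range(max_attempts):
        target = namenodes[attempt % len(namenodes)]
        try:
            return operation(target)
        except StandbyError as err:
            # The target is not active: fail over to the next NameNode.
            last_error = err
    raise RuntimeError(
        f"no active NameNode after {max_attempts} attempts"
    ) from last_error


# Simulated cluster state: host2 became active part-way through a copy.
active = "host2:9000"


def list_status(address):
    """Hypothetical RPC that only the active NameNode will serve."""
    if address != active:
        raise StandbyError(address)
    return f"listing served by {address}"


print(call_with_failover(["host1:9000", "host2:9000"], list_status))
```

A client that knows only one NameNode address has nowhere to retry, which is the failure described above. A client that knows both addresses can retry against the other one.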

      The following is an example of the contents of a source cluster configuration
      file:

          <configuration>
            <property>
              <name>fs.defaultFS</name>
              <value>hdfs://mycluster</value>
            </property>
            <property>
              <name>dfs.nameservices</name>
              <value>mycluster</value>
            </property>
            <property>
              <name>dfs.ha.namenodes.mycluster</name>
              <value>nn1,nn2</value>
            </property>
            <property>
              <name>dfs.namenode.rpc-address.mycluster.nn1</name>
              <value>host1:9000</value>
            </property>
            <property>
              <name>dfs.namenode.rpc-address.mycluster.nn2</name>
              <value>host2:9000</value>
            </property>
            <property>
              <name>dfs.namenode.http-address.mycluster.nn1</name>
              <value>host1:50070</value>
            </property>
            <property>
              <name>dfs.namenode.http-address.mycluster.nn2</name>
              <value>host2:50070</value>
            </property>
            <property>
              <name>dfs.client.failover.proxy.provider.mycluster</name>
              <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
            </property>
          </configuration>
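
Before passing such a file to DistCp, it may help to check that its HA keys are consistent. The following is a minimal Python sketch of such a check. The helper names `load_properties` and `missing_ha_keys` are hypothetical, the check is not part of the patch, and the set of keys it looks for is an assumption about what an HA client typically needs. The sample deliberately omits the nn2 RPC address so that the check has something to report.

```python
import xml.etree.ElementTree as ET

# Shortened sample with the nn2 RPC address deliberately left out.
SAMPLE = """
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>host1:9000</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Parse Hadoop-style <property> elements into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if name:
            props[name] = value
    return props


def missing_ha_keys(props):
    """Return the HA-related keys that the configuration does not define."""
    missing = []
    for ns in props.get("dfs.nameservices", "").split(","):
        ns = ns.strip()
        if not ns:
            continue
        nn_key = f"dfs.ha.namenodes.{ns}"
        if nn_key not in props:
            missing.append(nn_key)
            continue
        # Every listed NameNode ID needs its own RPC address.
        for nn in props[nn_key].split(","):
            rpc_key = f"dfs.namenode.rpc-address.{ns}.{nn.strip()}"
            if rpc_key not in props:
                missing.append(rpc_key)
        provider_key = f"dfs.client.failover.proxy.provider.{ns}"
        if provider_key not in props:
            missing.append(provider_key)
    return missing


print(missing_ha_keys(load_properties(SAMPLE)))
# prints ['dfs.namenode.rpc-address.mycluster.nn2']
```

Run against the full example file above, the same check would return an empty list, since both nn1 and nn2 have RPC addresses there.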
      

      The invocation of DistCp is as below:

          bash$ hadoop distcp -sourceClusterConf sourceCluster.xml /foo/bar hdfs://nn2:8020/bar/foo
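
For scripted use, the same invocation can be assembled as an argument list. This is a small Python sketch, assuming the -sourceClusterConf option proposed by this patch is available (it does not exist in an unpatched DistCp). `build_distcp_command` is a hypothetical helper name.

```python
import shlex


def build_distcp_command(source_conf, source_path, target_uri):
    """Build the argv for a DistCp run that reads the source via an HA config.

    Assumes the -sourceClusterConf option from this patch is present.
    """
    return [
        "hadoop", "distcp",
        "-sourceClusterConf", source_conf,
        source_path,
        target_uri,
    ]


argv = build_distcp_command(
    "sourceCluster.xml", "/foo/bar", "hdfs://nn2:8020/bar/foo"
)
# Show the command as it would be typed in a shell.
print(shlex.join(argv))
```

The list can be passed directly to `subprocess.run(argv, check=True)`, which avoids shell quoting issues in paths.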
      

      Attachments

        1. HDFS-9868.1.patch
          24 kB
          NING DING
        2. HDFS-9868.2.patch
          25 kB
          NING DING
        3. HDFS-9868.3.patch
          26 kB
          NING DING
        4. HDFS-9868.4.patch
          26 kB
          NING DING
        5. HDFS-9868.05.patch
          29 kB
          Xiao Chen
        6. HDFS-9868.06.patch
          34 kB
          Xiao Chen
        7. HDFS-9868.07.patch
          34 kB
          Xiao Chen
        8. HDFS-9868.08.patch
          40 kB
          Xiao Chen
        9. HDFS-9868.09.patch
          44 kB
          Xiao Chen
        10. HDFS-9868.10.patch
          49 kB
          Xiao Chen

            People

              Assignee: NING DING (iceberg565)
              Reporter: NING DING (iceberg565)
              Votes: 0
              Watchers: 11
