HIVE-20036: Hive Compactor MapReduce task keeps failing due to wrong Hadoop URI


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: 2.3.3
    • Fix Version/s: None
    • Component/s: Metastore
    • Labels: None

    Description

      I'm using Hive 2.3.3 with Hadoop 3.0.0 and Spark 2.2.1.

      I've created a partitioned ORC table and enabled compaction, but the compaction task keeps failing, complaining that a URI cannot be resolved.
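
      For context, compaction is enabled through the usual ACID settings in my hive-site.xml, roughly as below (a sketch reconstructed from memory, so the exact values may differ):

        <!-- hive-site.xml (sketch): ACID / compaction settings as I recall them -->
        <property>
          <name>hive.support.concurrency</name>
          <value>true</value>
        </property>
        <property>
          <name>hive.txn.manager</name>
          <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
        </property>
        <property>
          <!-- runs the compaction initiator thread in the metastore -->
          <name>hive.compactor.initiator.on</name>
          <value>true</value>
        </property>
        <property>
          <name>hive.compactor.worker.threads</name>
          <value>1</value>
        </property>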

      Here is the YARN application diagnostics log:

      Application application_1529550480937_0033 failed 2 times due to AM Container for appattempt_1529550480937_0033_000002 exited with exitCode: -1000
      Failing this attempt.Diagnostics: [2018-06-29 17:25:25.656]Port 8020 specified in URI hdfs://hadoopcluster:8020/tmp/hadoop-yarn/staging/smsuser/.staging/job_1529550480937_0033/job.splitmetainfo but host 'hadoopcluster' is a logical (HA) namenode and does not use port information.
      java.io.IOException: Port 8020 specified in URI hdfs://hadoopcluster:8020/tmp/hadoop-yarn/staging/smsuser/.staging/job_1529550480937_0033/job.splitmetainfo but host 'hadoopcluster' is a logical (HA) namenode and does not use port information.
      at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:266)
      at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:217)
      at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:127)
      at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:355)
      at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:289)
      at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:163)
      at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3288)
      at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:123)
      at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3337)
      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3305)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:476)
      at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
      at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
      at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
      at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:366)
      at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:364)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
      at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:241)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:234)
      at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:222)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      For more detailed output, check the application tracking page: http://cluster-master:8088/cluster/app/application_1529550480937_0033 Then click on links to logs of each attempt.
      . Failing the application.
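
      If I read the exception correctly, the HDFS client rejects any URI that combines a logical (HA) nameservice with an explicit port, because the nameservice is resolved through the failover proxy provider rather than a fixed host:port. So the staging path from the log should only be accepted without the port:

        rejected: hdfs://hadoopcluster:8020/tmp/hadoop-yarn/staging/smsuser/.staging/job_1529550480937_0033/job.splitmetainfo
        expected: hdfs://hadoopcluster/tmp/hadoop-yarn/staging/smsuser/.staging/job_1529550480937_0033/job.splitmetainfo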
      

      Here are my core-site.xml and hdfs-site.xml (in that order):

      <configuration>
        <property>
          <name>hadoop.tmp.dir</name>
          <value>file:/opt/hdfs/tmp/</value>
          <description>A base for other temporary directories.</description>
        </property>

        <property>
          <name>io.file.buffer.size</name>
          <!-- 128k -->
          <value>131072</value>
        </property>

        <property>
          <name>fs.defaultFS</name>
          <value>hdfs://hadoopcluster</value>
        </property>

        <property>
          <name>hadoop.proxyuser.smsuser.hosts</name>
          <value>*</value>
        </property>

        <property>
          <name>hadoop.proxyuser.smsuser.groups</name>
          <value>*</value>
        </property>
      </configuration>
      
      <configuration>
        <property>
          <name>dfs.nameservices</name>
          <value>hadoopcluster</value>
        </property>

        <property>
          <name>dfs.ha.namenodes.hadoopcluster</name>
          <value>cluster-master,cluster-backup</value>
        </property>

        <property>
          <name>dfs.namenode.rpc-address.hadoopcluster.cluster-master</name>
          <value>cluster-master:9820</value>
        </property>

        <property>
          <name>dfs.namenode.rpc-address.hadoopcluster.cluster-backup</name>
          <value>cluster-backup:9820</value>
        </property>

        <property>
          <name>dfs.namenode.http-address.hadoopcluster.cluster-master</name>
          <value>cluster-master:9870</value>
        </property>

        <property>
          <name>dfs.namenode.http-address.hadoopcluster.cluster-backup</name>
          <value>cluster-backup:9870</value>
        </property>

        <property>
          <name>dfs.namenode.shared.edits.dir</name>
          <value>qjournal://cluster-node1:8485;cluster-node2:8485;cluster-node3:8485/hadoopcluster</value>
        </property>

        <property>
          <name>dfs.client.failover.proxy.provider.hadoopcluster</name>
          <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>

        <property>
          <name>dfs.ha.fencing.methods</name>
          <value>sshfence</value>
        </property>

        <property>
          <name>dfs.ha.fencing.ssh.private-key-files</name>
          <value>/home/smsuser/.ssh/id_rsa</value>
        </property>

        <property>
          <name>dfs.journalnode.edits.dir</name>
          <value>/opt/hdfs/journal</value>
        </property>

        <property>
          <name>dfs.replication</name>
          <value>3</value>
        </property>

        <property>
          <name>dfs.namenode.name.dir</name>
          <value>/opt/hdfs/name</value>
        </property>

        <property>
          <name>dfs.datanode.name.dir</name>
          <value>/opt/hdfs/data</value>
        </property>
      </configuration>
      
      

       I guess there is a configuration mistake somewhere, but I failed to dig it out after a lot of searching and reading the source code.
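
      Since fs.defaultFS above carries no port, my guess is that the ":8020" is injected somewhere else that the compactor job picks up, e.g. a stale core-site.xml on the host that submits the compaction job, a staging-dir setting, or a table/partition location recorded in the metastore with the port. A hypothetical example of the kind of setting that would reproduce the error (yarn.app.mapreduce.am.staging-dir is only an illustration, not something I have confirmed in my setup):

        <!-- hypothetical: any client-side setting that still embeds the port breaks HA resolution -->
        <property>
          <name>yarn.app.mapreduce.am.staging-dir</name>
          <!-- wrong for an HA nameservice: explicit port on the logical name -->
          <value>hdfs://hadoopcluster:8020/tmp/hadoop-yarn/staging</value>
          <!-- should be a plain path or hdfs://hadoopcluster/tmp/hadoop-yarn/staging -->
        </property>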

       

      Please help me. Thanks a lot.

       

       


          People

            Assignee: Matrix0xCC
            Reporter: Matrix0xCC
