Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Not A Bug
- Affects Version/s: 2.3.3
- Fix Version/s: None
- Component/s: None
Description
I'm using Hive 2.3.3 with Hadoop 3.0.0 and Spark 2.2.1.
I created a partitioned ORC table and enabled compaction.
But the compaction task keeps failing, complaining that a URI cannot be resolved.
Here is the YARN application diagnostics log:
Application application_1529550480937_0033 failed 2 times due to AM Container for appattempt_1529550480937_0033_000002 exited with exitCode: -1000. Failing this attempt.
Diagnostics: [2018-06-29 17:25:25.656] Port 8020 specified in URI hdfs://hadoopcluster:8020/tmp/hadoop-yarn/staging/smsuser/.staging/job_1529550480937_0033/job.splitmetainfo but host 'hadoopcluster' is a logical (HA) namenode and does not use port information.
java.io.IOException: Port 8020 specified in URI hdfs://hadoopcluster:8020/tmp/hadoop-yarn/staging/smsuser/.staging/job_1529550480937_0033/job.splitmetainfo but host 'hadoopcluster' is a logical (HA) namenode and does not use port information.
    at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:266)
    at org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:217)
    at org.apache.hadoop.hdfs.NameNodeProxiesClient.createProxyWithClientProtocol(NameNodeProxiesClient.java:127)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:355)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:289)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:163)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3288)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:123)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3337)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3305)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:476)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:366)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:364)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:241)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:234)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:222)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
For more detailed output, check the application tracking page: http://cluster-master:8088/cluster/app/application_1529550480937_0033 then click on links to logs of each attempt. Failing the application.
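The error means the HDFS client saw an explicit port (:8020) attached to an authority ('hadoopcluster') that is registered as a logical HA nameservice, and HA nameservices must be referenced without a port. The rule can be sketched in a few lines of Python; this is only an illustration of the check, not Hadoop's actual code, and the NAMESERVICES set simply mirrors dfs.nameservices from the configs below:

```python
from urllib.parse import urlsplit

# Logical nameservices, mirroring dfs.nameservices in hdfs-site.xml (assumption).
NAMESERVICES = {"hadoopcluster"}

def check_uri(uri: str) -> str:
    """Mimic the HA client's rule: a logical nameservice must not carry a port."""
    parts = urlsplit(uri)
    if parts.hostname in NAMESERVICES and parts.port is not None:
        raise ValueError(
            f"Port {parts.port} specified in URI {uri} but host "
            f"'{parts.hostname}' is a logical (HA) namenode"
        )
    return parts.hostname

print(check_uri("hdfs://hadoopcluster/tmp/staging"))  # accepted: no port, HA resolution
try:
    check_uri("hdfs://hadoopcluster:8020/tmp/staging")  # rejected, as in the log above
except ValueError as e:
    print(e)
```

So the question is not whether HA is configured correctly (it is, per the files below), but where the :8020 in the staging-dir URI is coming from.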
These are my core-site.xml and hdfs-site.xml:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/opt/hdfs/tmp/</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <!-- 128k -->
    <value>131072</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoopcluster</value>
  </property>
  <property>
    <name>hadoop.proxyuser.smsuser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.smsuser.groups</name>
    <value>*</value>
  </property>
</configuration>
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>hadoopcluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.hadoopcluster</name>
    <value>cluster-master,cluster-backup</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoopcluster.cluster-master</name>
    <value>cluster-master:9820</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoopcluster.cluster-backup</name>
    <value>cluster-backup:9820</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoopcluster.cluster-master</name>
    <value>cluster-master:9870</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoopcluster.cluster-backup</name>
    <value>cluster-backup:9870</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://cluster-node1:8485;cluster-node2:8485;cluster-node3:8485/hadoopcluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.hadoopcluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/smsuser/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/hdfs/journal</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.name.dir</name>
    <value>/opt/hdfs/data</value>
  </property>
</configuration>
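Neither file above contains the port, so the stray :8020 is likely coming from some other copy of the configuration (e.g. the conf dirs shipped with Hive or Spark) or from stored paths such as table/partition LOCATIONs in the Hive metastore. One way to hunt for it is to scan candidate files for the nameservice paired with an explicit port; this is a hedged sketch (the regex, the sample input, and where to apply it are my assumptions, not a confirmed diagnosis):

```python
import re

# The logical nameservice from dfs.nameservices in the configs above.
NAMESERVICE = "hadoopcluster"

def find_port_uris(text: str) -> list[str]:
    """Return URIs in `text` that pair the logical nameservice with a port."""
    return re.findall(rf"hdfs://{NAMESERVICE}:\d+[^<\s]*", text)

# Hypothetical offending snippet from some stale *-site.xml:
sample = (
    "<property><name>fs.defaultFS</name>"
    "<value>hdfs://hadoopcluster:8020</value></property>"
)
print(find_port_uris(sample))  # ['hdfs://hadoopcluster:8020']
```

In practice you would run this over every *-site.xml under the Hadoop, Hive, and Spark conf directories, and check any HDFS paths recorded in the metastore, since any one of them could be feeding the ported URI to the compaction job.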
I suspect a configuration mistake, but I could not track it down despite searching extensively and reading the source code.
Please help me. Thanks a lot.