Description
Issue originally reported by Karam Singh.
All OrderedWordCount, WordCount and Tez faultTolerance system tests failed with java.net.UnknownHostException.
Interestingly, other Tez examples such as mrrsleep, randomwriter, randomtextwriter, sort, join_inner, join_outer, terasort and groupbyorderbymrrtest ran fine.
One such failing run follows:
RUNNING: /usr/lib/hadoop/bin/hadoop jar /usr/lib/tez/tez-mapreduce-examples-0.4.0.2.1.7.0-784.jar orderedwordcount "-DUSE_TEZ_SESSION=true" "-Dmapreduce.map.memory.mb=2048" "-Dtez.am.shuffle-vertex-manager.max-src-fraction=0" "-Dmapreduce.reduce.memory.mb=2048" "-Dmapreduce.framework.name=yarn-tez" "-Dtez.am.container.reuse.enabled=false" "-Dtez.am.log.level=DEBUG" "-Dmapreduce.map.java.opts=-Xmx1024m" "-Dtez.am.shuffle-vertex-manager.min-src-fraction=0" "-Dmapreduce.job.reduce.slowstart.completedmaps=0.01" "-Dmapreduce.reduce.java.opts=-Xmx1024m" "-Dtez.am.container.session.delay-allocation-millis=120000" /user/hrt_qa/Tez_CR_1/TestContainerReuse1 /user/hrt_qa/Tez_CROutput_1 /user/hrt_qa/Tez_CR_2/TestContainerReuse2 /user/hrt_qa/Tez_CROutput_2 -generateSplitsInClient true
14/12/19 09:20:05 INFO impl.TimelineClientImpl: Timeline service address: http://0.0.0.0:8188/ws/v1/timeline/
14/12/19 09:20:05 INFO client.RMProxy: Connecting to ResourceManager at headnode0.humb-tez1-ssh.d5.internal.cloudapp.net/10.0.0.87:8050
14/12/19 09:20:05 INFO client.AHSProxy: Connecting to Application History server at /0.0.0.0:10200
14/12/19 09:20:06 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
14/12/19 09:20:06 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 60 second(s).
14/12/19 09:20:06 INFO impl.MetricsSystemImpl: azure-file-system metrics system started
14/12/19 09:20:07 INFO client.TezClientUtils: Permissions on staging directory wasb://humb-tez1@humboldttesting.blob.core.windows.net/user/hrt_qa/.staging/application_1418977790315_0016 are incorrect: rwxr-xr-x. Fixing permissions to correct value rwx------
14/12/19 09:20:07 INFO examples.OrderedWordCount: Creating Tez Session
14/12/19 09:20:07 INFO impl.TimelineClientImpl: Timeline service address: http://0.0.0.0:8188/ws/v1/timeline/
14/12/19 09:20:07 INFO client.RMProxy: Connecting to ResourceManager at headnode0.humb-tez1-ssh.d5.internal.cloudapp.net/10.0.0.87:8050
14/12/19 09:20:07 INFO client.AHSProxy: Connecting to Application History server at /0.0.0.0:10200
14/12/19 09:20:09 INFO impl.YarnClientImpl: Submitted application application_1418977790315_0016
14/12/19 09:20:09 INFO examples.OrderedWordCount: Created Tez Session
14/12/19 09:20:09 INFO examples.OrderedWordCount: Running OrderedWordCount DAG, dagIndex=1, inputPath=/user/hrt_qa/Tez_CR_1/TestContainerReuse1, outputPath=/user/hrt_qa/Tez_CROutput_1
14/12/19 09:20:09 INFO hadoop.MRHelpers: Generating new input splits, splitsDir=wasb://humb-tez1@humboldttesting.blob.core.windows.net/user/hrt_qa/.staging/application_1418977790315_0016
14/12/19 09:20:09 INFO input.FileInputFormat: Total input paths to process : 20
14/12/19 09:20:09 INFO examples.OrderedWordCount: Waiting for TezSession to get into ready state
14/12/19 09:20:14 INFO client.TezSession: Failed to retrieve AM Status via proxy
org.apache.tez.dag.api.TezException: com.google.protobuf.ServiceException: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "workernode1":59575; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
    at org.apache.tez.client.TezSession.getSessionStatus(TezSession.java:351)
    at org.apache.tez.mapreduce.examples.OrderedWordCount.waitForTezSessionReady(OrderedWordCount.java:538)
    at org.apache.tez.mapreduce.examples.OrderedWordCount.main(OrderedWordCount.java:461)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
    at org.apache.tez.mapreduce.examples.ExampleDriver.main(ExampleDriver.java:88)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: com.google.protobuf.ServiceException: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "workernode1":59575; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
    at com.sun.proxy.$Proxy24.getAMStatus(Unknown Source)
    at org.apache.tez.client.TezSession.getSessionStatus(TezSession.java:337)
    ... 14 more
Caused by: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "workernode1":59575; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:742)
    at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:400)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1452)
    at org.apache.hadoop.ipc.Client.call(Client.java:1381)
    at org.apache.hadoop.ipc.Client.call(Client.java:1363)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    ... 16 more
Caused by: java.net.UnknownHostException
    ... 21 more
....................
....................
Caused by: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "workernode1":59575; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
    at sun.reflect.GeneratedConstructorAccessor22.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:742)
    at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:400)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1452)
    at org.apache.hadoop.ipc.Client.call(Client.java:1381)
    at org.apache.hadoop.ipc.Client.call(Client.java:1363)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    ... 16 more
Caused by: java.net.UnknownHostException
    ... 21 more
14/12/19 09:25:19 ERROR examples.OrderedWordCount: Error occurred when submitting/running DAGs
java.lang.RuntimeException: TezSession has already shutdown
    at org.apache.tez.mapreduce.examples.OrderedWordCount.waitForTezSessionReady(OrderedWordCount.java:540)
    at org.apache.tez.mapreduce.examples.OrderedWordCount.main(OrderedWordCount.java:461)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
    at org.apache.tez.mapreduce.examples.ExampleDriver.main(ExampleDriver.java:88)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
14/12/19 09:25:19 INFO examples.OrderedWordCount: Shutting down session
14/12/19 09:25:19 INFO client.TezSession: Shutting down Tez Session, sessionName=OrderedWordCountSession, applicationId=application_1418977790315_0016
14/12/19 09:25:19 INFO client.TezSession: Failed to shutdown Tez Session via proxy
org.apache.tez.dag.api.SessionNotRunning: Application not running, applicationId=application_1418977790315_0016, yarnApplicationState=FINISHED, finalApplicationStatus=SUCCEEDED, trackingUrl=http://headnode0.humb-tez1-ssh.d5.internal.cloudapp.net:8088/proxy/application_1418977790315_0016/A
    at org.apache.tez.client.TezClientUtils.getSessionAMProxy(TezClientUtils.java:733)
    at org.apache.tez.client.TezSession.stop(TezSession.java:281)
    at org.apache.tez.mapreduce.examples.OrderedWordCount.main(OrderedWordCount.java:524)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
    at org.apache.tez.mapreduce.examples.ExampleDriver.main(ExampleDriver.java:88)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
14/12/19 09:25:19 INFO client.TezSession: Could not connect to AM, killing session via YARN, sessionName=OrderedWordCountSession, applicationId=application_1418977790315_0016
14/12/19 09:25:19 INFO impl.YarnClientImpl: Killed application application_1418977790315_0016
java.lang.RuntimeException: TezSession has already shutdown
    at org.apache.tez.mapreduce.examples.OrderedWordCount.waitForTezSessionReady(OrderedWordCount.java:540)
    at org.apache.tez.mapreduce.examples.OrderedWordCount.main(OrderedWordCount.java:461)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
    at org.apache.tez.mapreduce.examples.ExampleDriver.main(ExampleDriver.java:88)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Contents of /etc/hosts are:
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
Contents of resolv.conf are:
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 168.63.129.16
search humb-tez1-ssh.d5.internal.cloudapp.net
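
A possible way to narrow this down on the client host is sketched below in Python. It is not part of the original report. It reads the search domains from resolv.conf text and lists the names worth checking for the short host name "workernode1" seen in the stack trace: the short name itself and the short name qualified with each search domain. The helper names parse_search_domains, candidate_names and try_resolve are hypothetical, and whether any candidate actually resolves depends on the cluster's DNS.

```python
import socket


def parse_search_domains(resolv_conf_text):
    """Return the domains listed on 'search' lines of resolv.conf text."""
    domains = []
    for line in resolv_conf_text.splitlines():
        # Drop trailing comments, then split the directive into fields.
        fields = line.split("#", 1)[0].split()
        if fields and fields[0] == "search":
            domains.extend(fields[1:])
    return domains


def candidate_names(host, search_domains):
    """Return the short name plus its search-domain-qualified forms."""
    names = [host]
    if "." not in host:
        names.extend(f"{host}.{domain}" for domain in search_domains)
    return names


def try_resolve(name):
    """Return an IPv4 address for name, or None if resolution fails."""
    try:
        return socket.gethostbyname(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


if __name__ == "__main__":
    # Assumption: run on the client host that submitted the Tez job.
    with open("/etc/resolv.conf") as f:
        domains = parse_search_domains(f.read())
    for name in candidate_names("workernode1", domains):
        print(name, "->", try_resolve(name) or "UNRESOLVED")
```

If the short name is unresolved but a qualified name resolves, that would suggest the problem lies with how the AM host name is advertised to the client rather than with DNS itself; this is only a diagnostic aid, not a confirmed root cause.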