Details
-
Improvement
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
Description
Since TEZ-4236 (in Tez 0.10.1), tez local mode can run without even starting an RPC server in the DAGAppMaster, which is in the same JVM as the client.
Adapting tez.local.mode.without.network=true could make tez.local.mode=true unit tests more stable.
here is an example where I had no idea why the dag app master connection was refused:
2022-07-29T07:56:24,701 INFO [main_executor] ql.Driver: Executing command(queryId=jenkins_20220729075624_b3ba4c8a-82d5-4ebd-b4b0-218325a71b10): INSERT into table default.tmp_minor_compactor_testmmminorcompaction_1659106584519_result select `a`, `b` from default.tmp_minor_compactor_testmmminorcompaction_1659106584519 2022-07-29T07:56:24,823 INFO [ServiceThread:DAGClientRPCServer] client.DAGClientServer: Instantiated DAGClientRPCServer at internal-hive-flaky-check-88-xwmrs-v2h77-knnxx/10.106.3.19:22623 2022-07-29T07:56:24,823 INFO [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerManager] rm.TaskSchedulerManager: Creating TaskScheduler: Local TaskScheduler with clusterIdentifier=1659106584728 2022-07-29T07:56:24,825 INFO [DAGAppMaster Thread] HistoryEventHandler.criticalEvents: [HISTORY][DAG:N/A][Event:AM_STARTED]: appAttemptId=appattempt_1659106584728_0001_000000, startTime=1659106584825 2022-07-29T07:56:24,825 INFO [DAGAppMaster Thread] app.DAGAppMaster: In Session mode. Waiting for DAG over RPC 2022-07-29T07:56:24,871 INFO [main_executor] client.LocalClient: DAGAppMaster state: IDLE 2022-07-29T07:56:24,871 INFO [main_executor] client.TezClient: The url to track the Tez Session: N/A ... 2022-07-29T07:56:46,384 INFO [main_executor] client.TezClient: Failed to retrieve AM Status via proxy com.google.protobuf.ServiceException: java.net.ConnectException: Call From internal-hive-flaky-check-88-xwmrs-v2h77-knnxx/10.106.3.19 to internal-hive-flaky-check-88-xwmrs-v2h77-knnxx:22623 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:247) ~[hadoop-common-3.1.1.7.2.15.0-147.jar:?] at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) ~[hadoop-common-3.1.1.7.2.15.0-147.jar:?] at com.sun.proxy.$Proxy50.getAMStatus(Unknown Source) ~[?:?]
instead of diving deep into an evil environment related bug, we can simply utilize TEZ-4236 in these cases too
Attachments
Issue Links
- is related to
-
TEZ-4236 DAGClientServer is not really needed to be started/used in local mode
-
- Resolved
-
- relates to
-
HIVE-26445 Use tez.local.mode.without.network for qtests
-
- In Progress
-
- links to