Description
In the following two cases, the AM propagates the wrong error message to the client ("App master already running a DAG"):
- The last DAG has completed but the AM is still in the RUNNING state.
- The AM is shutting down.
2015-04-10 06:01:50,369 INFO [IPC Server handler 0 on 46821] ipc.Server (Server.java:run(2070)) - IPC Server handler 0 on 46821, call org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPB.submitDAG from 10.0.0.223:48581 Call#411 Retry#0
org.apache.tez.dag.api.TezException: App master already running a DAG
    at org.apache.tez.dag.app.DAGAppMaster.submitDAGToAppMaster(DAGAppMaster.java:1131)
    at org.apache.tez.dag.api.client.DAGClientHandler.submitDAG(DAGClientHandler.java:118)
    at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(DAGClientAMProtocolBlockingPBServerImpl.java:163)
    at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7471)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
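One possible direction, sketched only to illustrate the two cases: pick the rejection message from the AM state and from whether a DAG is actually in progress, instead of always reporting "App master already running a DAG". Everything here is hypothetical (the class SubmitRejectionSketch, the enum AmState, the method rejectionReason, and the wording of the two new messages); it is not the DAGAppMaster code and not necessarily the fix that should be applied.

```java
import java.util.Optional;

// Hypothetical sketch; none of these names exist in Tez.
final class SubmitRejectionSketch {

    // Simplified stand-in for the AM state; the real DAGAppMaster has more states.
    enum AmState { IDLE, RUNNING, SHUTTING_DOWN }

    private SubmitRejectionSketch() { }

    // Returns the message to send back to the client when a submission
    // must be rejected, or Optional.empty() when it can be accepted.
    static Optional<String> rejectionReason(AmState state, boolean dagInProgress) {
        if (state == AmState.SHUTTING_DOWN) {
            // Second case in the description: AM is shutting down.
            return Optional.of("App master is shutting down, cannot accept new DAGs");
        }
        if (state == AmState.RUNNING && !dagInProgress) {
            // First case in the description: last DAG completed,
            // but the AM has not left the RUNNING state yet.
            return Optional.of("App master is still finishing the previous DAG, retry the submission");
        }
        if (dagInProgress) {
            // The only case where the existing message is accurate.
            return Optional.of("App master already running a DAG");
        }
        return Optional.empty();
    }
}
```

With this shape, a client could tell a retryable condition (previous DAG still being wrapped up) apart from a terminal one (AM shutting down), which the single message in the log above does not allow.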
Attachments
Issue Links
- is related to TEZ-1273 Refactor DAGAppMaster to state machine based (Open)