Samza / SAMZA-498

Kill YARN Job doesn't kill the container

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.8.0
    • Fix Version/s: None
    • Component/s: container, yarn
    • Labels: None

      Description

      I tried to kill a Samza job with kill-yarn-job.sh, and the console output looks like this:

      2014-12-12 13:47:54 RMProxy [INFO] Connecting to ResourceManager at xxxxxxxx/10.2.0.79:8032
      2014-12-12 13:48:03 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      Killing application application_1418381509268_0016
      2014-12-12 13:48:03 YarnClientImpl [INFO] Killed application application_1418381509268_0016

      But the job wasn't killed. All containers are still running, and the log on the server looks like this:

      2014-12-12 13:48:04,003 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=user.name     IP=10.255.250.131      OPERATION=Kill Application Request      TARGET=ClientRMService    RESULT=SUCCESS  APPID=application_1418381509268_0016
      2014-12-12 13:48:04,565 ERROR org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Application attempt appattempt_1418381509268_0016_000002 doesn't exist in ApplicationMasterService cache.
      2014-12-12 13:48:04,565 INFO org.apache.hadoop.ipc.Server: IPC Server handler 27 on 8030, call org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from 10.2.0.88:40934 Call#751 Retry#0
      org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: Application attempt appattempt_1418381509268_0016_000002 doesn't exist in ApplicationMasterService cache.
              at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:436)
              at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
              at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
      2014-12-12 13:48:05,202 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: application_1418381509268_0016 unregistered successfully. 
      2014-12-12 13:48:06,229 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...

      Any hints on what I can try?
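
      When reading logs like the ones above, it can help to pull out which application and attempt IDs they mention. YARN attempt IDs take the form appattempt_<clusterTimestamp>_<sequence>_<attemptNumber>, so the 000002 suffix in the error means the failing allocate call came from the application's second attempt. The following is a minimal sketch for extracting these IDs, not part of Samza or YARN; the function name summarize_ids and the sample text (shortened from the log lines above) are illustrative assumptions.

```python
import re

# YARN ID formats:
#   application_<clusterTimestamp>_<sequence>
#   appattempt_<clusterTimestamp>_<sequence>_<attemptNumber>
APP_RE = re.compile(r"\bapplication_(\d+)_(\d+)\b")
ATTEMPT_RE = re.compile(r"\bappattempt_(\d+)_(\d+)_(\d+)\b")


def summarize_ids(log_text):
    """Map each application ID found in log_text to the sorted attempt numbers seen.

    Hypothetical helper for reading logs; it only parses text and does not
    talk to the ResourceManager.
    """
    summary = {}
    # Record every application ID, even if no attempt is mentioned for it.
    for ts, seq in APP_RE.findall(log_text):
        summary.setdefault(f"application_{ts}_{seq}", set())
    # Attach attempt numbers to the application they belong to.
    for ts, seq, attempt in ATTEMPT_RE.findall(log_text):
        summary.setdefault(f"application_{ts}_{seq}", set()).add(int(attempt))
    return {app: sorted(attempts) for app, attempts in summary.items()}


# Sample text shortened from the log excerpts in this report.
sample = (
    "OPERATION=Kill Application Request TARGET=ClientRMService "
    "RESULT=SUCCESS APPID=application_1418381509268_0016\n"
    "Application attempt appattempt_1418381509268_0016_000002 "
    "doesn't exist in ApplicationMasterService cache.\n"
)

print(summarize_ids(sample))
```

      Running the same function over the full ResourceManager log would show whether more than one attempt of the application was active around the time of the kill request.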

            People

            • Assignee: Unassigned
            • Reporter: Falk Scheerschmidt