FLINK-8887

ClusterClient.getJobStatus can throw FencingTokenException

      Description
      Calling RestClusterClient.getJobStatus or MiniClusterClient.getJobStatus can result in a FencingTokenException.

      Analysis
      Dispatcher.requestJobStatus first looks up the JobManagerRunner by job ID. If a reference is found, requestJobStatus is called on that instance; otherwise the ArchivedExecutionGraphStore is queried. However, between the lookup and the method call, the JobMaster of that job may already have lost leadership (for example, because the job finished) and set its fencing token to null. The call is then rejected with a FencingTokenException.
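
The race described in the analysis can be reproduced in miniature with the sketch below. It is only an illustration under stated assumptions, not Flink code and not necessarily the fix that was applied: `JobStatusLookupSketch`, `Runner`, `FencedOutException`, `runners` and `archivedStatuses` are hypothetical stand-ins for the Dispatcher, the JobManagerRunner/JobMaster, FencingTokenException, the runner registry and the ArchivedExecutionGraphStore. One possible way to tolerate the race is to fall back to the archive when the live call is fenced out.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the lookup-then-call race; none of these types are Flink classes.
class JobStatusLookupSketch {

    /** Stand-in for FencingTokenException (assumption, simplified). */
    static class FencedOutException extends RuntimeException {
        FencedOutException(String message) {
            super(message);
        }
    }

    /** Stand-in for a job's runner whose endpoint rejects calls once its token is null. */
    static class Runner {
        private volatile String fencingToken = "b8423c75";

        void revokeLeadership() {
            fencingToken = null;
        }

        CompletableFuture<String> requestJobStatus() {
            if (fencingToken == null) {
                CompletableFuture<String> failed = new CompletableFuture<>();
                failed.completeExceptionally(new FencedOutException("Fencing token not set"));
                return failed;
            }
            return CompletableFuture.completedFuture("RUNNING");
        }
    }

    final Map<String, Runner> runners = new ConcurrentHashMap<>();
    final Map<String, String> archivedStatuses = new ConcurrentHashMap<>();

    /** Looks up the runner first; falls back to the archive if the call is fenced out. */
    CompletableFuture<String> requestJobStatus(String jobId) {
        Runner runner = runners.get(jobId);
        if (runner == null) {
            return lookupArchived(jobId);
        }
        // Race window: leadership may be revoked between the lookup above and the call below.
        return runner.requestJobStatus()
            .<CompletableFuture<String>>handle((status, failure) -> {
                if (failure == null) {
                    return CompletableFuture.completedFuture(status);
                }
                Throwable cause =
                    failure instanceof CompletionException ? failure.getCause() : failure;
                if (cause instanceof FencedOutException) {
                    return lookupArchived(jobId);
                }
                CompletableFuture<String> failed = new CompletableFuture<>();
                failed.completeExceptionally(cause);
                return failed;
            })
            .thenCompose(future -> future);
    }

    private CompletableFuture<String> lookupArchived(String jobId) {
        String status = archivedStatuses.get(jobId);
        if (status != null) {
            return CompletableFuture.completedFuture(status);
        }
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("Unknown job " + jobId));
        return failed;
    }

    public static void main(String[] args) {
        JobStatusLookupSketch dispatcher = new JobStatusLookupSketch();
        Runner runner = new Runner();
        dispatcher.runners.put("job-1", runner);
        System.out.println(dispatcher.requestJobStatus("job-1").join());

        // The job finishes: its result is archived and leadership is revoked,
        // but the runner is still registered when the next status request arrives.
        dispatcher.archivedStatuses.put("job-1", "FINISHED");
        runner.revokeLeadership();
        System.out.println(dispatcher.requestJobStatus("job-1").join());
    }
}
```

Without the fallback branch, the second request in `main` would complete exceptionally with the fenced-out error, which mirrors the behaviour reported in this issue.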

      Stacktrace

      Caused by: org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token mismatch: Ignoring message LocalFencedMessage(null, LocalRpcInvocation(requestJobStatus(Time))) because the fencing token null did not match the expected fencing token b8423c75bc6838244b8c93c8bd4a4f51.
      	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:73)
      	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:132)
      	at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
      	at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
      	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
      	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
      	at akka.actor.ActorCell.invoke(ActorCell.scala:495)
      	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
      	at akka.dispatch.Mailbox.run(Mailbox.scala:224)
      	at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
      	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      
      Caused by: org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token not set: Ignoring message LocalFencedMessage(null, LocalRpcInvocation(requestJobStatus(Time))) because the fencing token is null.
      	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:56)
      	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:132)
      	at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
      	at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
      	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
      	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
      	at akka.actor.ActorCell.invoke(ActorCell.scala:495)
      	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
      	at akka.dispatch.Mailbox.run(Mailbox.scala:224)
      	at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
      	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      

            People

            • Assignee: Till Rohrmann (trohrmann)
            • Reporter: Gary Yao (gjy)
