FLINK-8887

ClusterClient.getJobStatus can throw FencingTokenException

      Description
      Calling RestClusterClient.getJobStatus or MiniClusterClient.getJobStatus can result in a FencingTokenException.

      Analysis
      Dispatcher.requestJobStatus first looks up the JobManagerRunner by job ID. If a reference is found, requestJobStatus is called on the respective instance; otherwise, the ArchivedExecutionGraphStore is queried. However, between the lookup and the method call, the JobMaster of the respective job may already have lost leadership (because the job finished) and set its fencing token to null. The call is then rejected with a FencingTokenException.
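
      The race described above can be reproduced in isolation, along with one possible way to handle it. The following is a minimal sketch, not Flink code and not necessarily the fix adopted for this issue: FakeJobMaster, the two maps, and the local FencingTokenException are hypothetical stand-ins for the JobManagerRunner lookup, the ArchivedExecutionGraphStore, and the real exception class. It assumes that a fencing failure after a successful lookup means the job has finished, and falls back to the archived status in that case.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ConcurrentHashMap;

class JobStatusFallbackSketch {

    /** Hypothetical stand-in for Flink's FencingTokenException. */
    static class FencingTokenException extends RuntimeException {
        FencingTokenException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a JobMaster that rejects calls once its token is null. */
    static class FakeJobMaster {
        private volatile String fencingToken = "token";

        void loseLeadership() {
            fencingToken = null;
        }

        CompletableFuture<String> requestJobStatus() {
            if (fencingToken == null) {
                CompletableFuture<String> failed = new CompletableFuture<>();
                failed.completeExceptionally(new FencingTokenException("Fencing token not set"));
                return failed;
            }
            return CompletableFuture.completedFuture("RUNNING");
        }
    }

    // Assumed stand-ins for the runner lookup and the archived execution graph store.
    final Map<String, FakeJobMaster> runners = new ConcurrentHashMap<>();
    final Map<String, String> archivedStatuses = new ConcurrentHashMap<>();

    CompletableFuture<String> requestJobStatus(String jobId) {
        FakeJobMaster runner = runners.get(jobId);
        if (runner == null) {
            return lookUpArchived(jobId);
        }
        // The runner may lose leadership between the lookup above and this call.
        return runner.requestJobStatus()
            .<CompletableFuture<String>>handle((status, failure) -> {
                if (failure == null) {
                    return CompletableFuture.completedFuture(status);
                }
                Throwable cause = (failure instanceof CompletionException && failure.getCause() != null)
                    ? failure.getCause()
                    : failure;
                if (cause instanceof FencingTokenException) {
                    // Assumption: a fenced-off job master means the job has been archived.
                    return lookUpArchived(jobId);
                }
                CompletableFuture<String> failed = new CompletableFuture<>();
                failed.completeExceptionally(cause);
                return failed;
            })
            .thenCompose(future -> future);
    }

    private CompletableFuture<String> lookUpArchived(String jobId) {
        String archived = archivedStatuses.get(jobId);
        if (archived != null) {
            return CompletableFuture.completedFuture(archived);
        }
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("Unknown job " + jobId));
        return failed;
    }

    public static void main(String[] args) {
        JobStatusFallbackSketch dispatcher = new JobStatusFallbackSketch();
        FakeJobMaster master = new FakeJobMaster();
        dispatcher.runners.put("job-1", master);
        System.out.println(dispatcher.requestJobStatus("job-1").join());

        // Simulate the race: the job finishes, but the runner reference is still registered.
        master.loseLeadership();
        dispatcher.archivedStatuses.put("job-1", "FINISHED");
        System.out.println(dispatcher.requestJobStatus("job-1").join());
    }
}
```

      In this sketch the stale runner reference is still found by the lookup, so the fallback only triggers when the call itself fails with the fencing error. Any other failure is passed through unchanged.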

      Stacktrace

      Caused by: org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token mismatch: Ignoring message LocalFencedMessage(null, LocalRpcInvocation(requestJobStatus(Time))) because the fencing token null did not match the expected fencing token b8423c75bc6838244b8c93c8bd4a4f51.
      	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:73)
      	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:132)
      	at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
      	at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
      	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
      	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
      	at akka.actor.ActorCell.invoke(ActorCell.scala:495)
      	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
      	at akka.dispatch.Mailbox.run(Mailbox.scala:224)
      	at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
      	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      
      Caused by: org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token not set: Ignoring message LocalFencedMessage(null, LocalRpcInvocation(requestJobStatus(Time))) because the fencing token is null.
      	at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleMessage(FencedAkkaRpcActor.java:56)
      	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:132)
      	at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
      	at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
      	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
      	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
      	at akka.actor.ActorCell.invoke(ActorCell.scala:495)
      	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
      	at akka.dispatch.Mailbox.run(Mailbox.scala:224)
      	at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
      	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
      	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
      	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
      	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      

          People

            trohrmann Till Rohrmann
            gjy Gary Yao