FLINK-21606

TaskManager connected to invalid JobManager leading to TaskSubmissionException


Details

    Description

      While testing reactive mode, I had to start my JobManager a few times to get the configuration right. While doing that, at least one TaskManager (TM6) was first connected to the first JobManager (with a running job) and then to the second one.

      On the second JobManager, I was able to execute my test job on another TaskManager (TMx). Once TM6 reconnected and reactive mode tried to utilize all available resources, I repeatedly ran into this issue:

      2021-03-04 15:49:36,322 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Window(GlobalWindows(), DeltaTrigger, TimeEvictor, ComparableAggregator, PassThroughWindowFunction) -> Sink: Print to Std. Out (5/7) (ae8f39c8dd88148aff93c8f811fab22e) switched from DEPLOYING to FAILED on 192.168.2.173:64041-4f7521 @ macbook-pro-2.localdomain (dataPort=64044).
      java.util.concurrent.CompletionException: org.apache.flink.runtime.taskexecutor.exceptions.TaskSubmissionException: Could not submit task because there is no JobManager associated for the job bbe8634736b5b1d813dd322cfaaa08ea.
      	at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326) ~[?:1.8.0_252]
      	at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338) ~[?:1.8.0_252]
      	at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:925) ~[?:1.8.0_252]
      	at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:913) ~[?:1.8.0_252]
      	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_252]
      	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_252]
      	at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:234) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_252]
      	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_252]
      	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_252]
      	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_252]
      	at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1064) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.OnComplete.internal(Future.scala:263) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.OnComplete.internal(Future.scala:261) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:101) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:999) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.actor.Actor$class.aroundReceive(Actor.scala:517) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:458) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.actor.ActorCell.invoke(ActorCell.scala:561) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.Mailbox.run(Mailbox.scala:225) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.Mailbox.exec(Mailbox.scala:235) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      Caused by: org.apache.flink.runtime.taskexecutor.exceptions.TaskSubmissionException: Could not submit task because there is no JobManager associated for the job bbe8634736b5b1d813dd322cfaaa08ea.
      	at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$submitTask$3(TaskExecutor.java:523) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at java.util.Optional.orElseThrow(Optional.java:290) ~[?:1.8.0_252]
      	at org.apache.flink.runtime.taskexecutor.TaskExecutor.submitTask(TaskExecutor.java:514) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_252]
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_252]
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_252]
      	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_252]
      	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.actor.Actor$class.aroundReceive(Actor.scala:517) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT]
      	... 9 more
      

      I will upload all logs to this ticket and post my initial analysis.
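
To make the failure mode easier to reason about, here is a minimal sketch of how this message can arise. It is not Flink's implementation: the class `JobTableSketch`, its methods, and the string job and JobManager identifiers are all hypothetical. The only detail taken from the stack trace is that `TaskExecutor.submitTask` throws via `Optional.orElseThrow` when it finds no JobManager associated with the job. The sketch assumes the TaskManager keeps a per-job record of its JobManager connection and that the record for the new job is missing while one for the old job still exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of a TaskManager that tracks one JobManager connection per job.
// This is an illustration only, not the Flink TaskExecutor.
public class JobTableSketch {

    // jobId -> address of the JobManager this TaskManager believes is responsible for the job
    private final Map<String, String> jobManagerConnections = new HashMap<>();

    void connect(String jobId, String jobManagerAddress) {
        jobManagerConnections.put(jobId, jobManagerAddress);
    }

    // Rejects the submission if no JobManager connection is recorded for the job.
    String submitTask(String jobId) {
        String jobManager = Optional.ofNullable(jobManagerConnections.get(jobId))
                .orElseThrow(() -> new IllegalStateException(
                        "Could not submit task because there is no JobManager "
                                + "associated for the job " + jobId + "."));
        return "accepted task for job " + jobId + " from " + jobManager;
    }

    public static void main(String[] args) {
        JobTableSketch tm6 = new JobTableSketch();

        // TM6 is still associated with the job of the first JobManager (placeholder ids).
        tm6.connect("job-of-first-jobmanager", "first-jobmanager");

        // The second JobManager deploys a task for a job TM6 has no connection for.
        try {
            tm6.submitTask("job-of-second-jobmanager");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // Once the connection for that job is recorded, the submission is accepted.
        tm6.connect("job-of-second-jobmanager", "second-jobmanager");
        System.out.println(tm6.submitTask("job-of-second-jobmanager"));
    }
}
```

In this model the submission keeps failing for as long as the record for the new job is missing, which would match the repeated failures described above. Whether that is what actually happened on TM6 has to be confirmed from the logs.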


          People

            trohrmann Till Rohrmann
            rmetzger Robert Metzger