Flink / FLINK-19426

Streaming File Sink end-to-end test sometimes fails with "Could not assign resource ... to current execution ..."

    Description

      https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=6983&view=logs&j=68a897ab-3047-5660-245a-cce8f83859f6&t=16ca2cca-2f63-5cce-12d2-d519b930a729

      2020-09-26T22:16:26.9856525Z org.apache.flink.runtime.io.network.partition.consumer.PartitionConnectionException: Connection for partition 619775973ed0f282e20f9d55d13913ab#0@bc764cd8ddf7a0cff126f51c16239658_0_1 not reachable.
      2020-09-26T22:16:26.9857848Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:159) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9859168Z 	at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.internalRequestPartitions(SingleInputGate.java:336) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9860449Z 	at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:308) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9861677Z 	at org.apache.flink.runtime.taskmanager.InputGateWithMetrics.requestPartitions(InputGateWithMetrics.java:95) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9862861Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.requestPartitions(StreamTask.java:542) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9864018Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.readRecoveredChannelState(StreamTask.java:507) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9865284Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:498) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9866415Z 	at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9867500Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:492) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9868514Z 	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:550) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9869450Z 	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9870339Z 	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) [flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9870869Z 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
      2020-09-26T22:16:26.9872060Z Caused by: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/10.1.0.4:38905' has failed. This might indicate that the remote task manager has been lost.
      2020-09-26T22:16:26.9873511Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:85) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9874788Z 	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9876084Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:156) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9876567Z 	... 12 more
      2020-09-26T22:16:26.9877477Z Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/10.1.0.4:38905' has failed. This might indicate that the remote task manager has been lost.
      2020-09-26T22:16:26.9878503Z 	at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_265]
      2020-09-26T22:16:26.9879061Z 	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) ~[?:1.8.0_265]
      2020-09-26T22:16:26.9880244Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:83) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9884461Z 	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9885737Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:156) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9886304Z 	... 12 more
      2020-09-26T22:16:26.9887211Z Caused by: org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/10.1.0.4:38905' has failed. This might indicate that the remote task manager has been lost.
      2020-09-26T22:16:26.9888456Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connect(PartitionRequestClientFactory.java:122) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9889704Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connectWithRetries(PartitionRequestClientFactory.java:101) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9891028Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.lambda$createPartitionRequestClient$1(PartitionRequestClientFactory.java:78) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9892193Z 	at org.apache.flink.runtime.concurrent.FutureUtils.completeFromCallable(FutureUtils.java:87) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9893396Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:78) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9894646Z 	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9895718Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:156) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9896201Z 	... 12 more
      2020-09-26T22:16:26.9896424Z Caused by: java.lang.NullPointerException
      2020-09-26T22:16:26.9897066Z 	at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:58) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9898008Z 	at org.apache.flink.runtime.io.network.netty.NettyPartitionRequestClient.<init>(NettyPartitionRequestClient.java:73) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9899040Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connect(PartitionRequestClientFactory.java:116) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9900118Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connectWithRetries(PartitionRequestClientFactory.java:101) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9901443Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.lambda$createPartitionRequestClient$1(PartitionRequestClientFactory.java:78) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9902613Z 	at org.apache.flink.runtime.concurrent.FutureUtils.completeFromCallable(FutureUtils.java:87) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9904043Z 	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:78) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9905404Z 	at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9906893Z 	at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:156) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
      2020-09-26T22:16:26.9907510Z 	... 12 more
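      Note on reading the trace: the outermost PartitionConnectionException wraps an IOException, an ExecutionException and a RemoteTransportException. The innermost cause is a NullPointerException thrown by Preconditions.checkNotNull in the NettyPartitionRequestClient constructor (NettyPartitionRequestClient.java:73), reached from PartitionRequestClientFactory.connect (PartitionRequestClientFactory.java:116). The following is a minimal sketch of walking such a cause chain programmatically. The class name CauseChain and its methods are hypothetical helpers, not Flink code, and the chain built in main uses JDK exception types as stand-ins for the Flink exceptions.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;

// Hypothetical helper for illustration only; not part of Flink.
public class CauseChain {

    /** Lists the throwables from the outermost exception down to the root cause. */
    static List<Throwable> unwind(Throwable outer) {
        List<Throwable> chain = new ArrayList<>();
        // Stop at a null cause, or if a throwable repeats (cyclic cause chain).
        for (Throwable t = outer; t != null && !chain.contains(t); t = t.getCause()) {
            chain.add(t);
        }
        return chain;
    }

    /** Returns the innermost cause of a non-null throwable. */
    static Throwable rootCause(Throwable outer) {
        List<Throwable> chain = unwind(outer);
        return chain.get(chain.size() - 1);
    }

    public static void main(String[] args) {
        // Stand-in chain mirroring the nesting in the log above:
        // RuntimeException      ~ PartitionConnectionException (assumption)
        // IllegalStateException ~ RemoteTransportException (assumption)
        Throwable npe = new NullPointerException();
        Throwable transport = new IllegalStateException("Connecting to remote task manager has failed.", npe);
        Throwable execution = new ExecutionException(transport);
        Throwable io = new IOException(execution);
        Throwable outer = new RuntimeException("Connection for partition not reachable.", io);

        // Print each level of the chain, outermost first.
        for (Throwable t : unwind(outer)) {
            System.out.println(t.getClass().getName());
        }
        System.out.println("root cause: " + rootCause(outer).getClass().getName());
    }
}
```

      Applied to the log above, the same walk ends at the NullPointerException, which suggests the "remote task manager has been lost" message is a generic wrapper text rather than the actual failure.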
      

            People

              rmetzger Robert Metzger
              dian.fu Dian Fu