Flink / FLINK-11631

TaskExecutorITCase#testJobReExecutionAfterTaskExecutorTermination unstable on Travis

Details

    Description

      The test TaskExecutorITCase#testJobReExecutionAfterTaskExecutorTermination is unstable on Travis. It fails with:

      16:12:04.644 [ERROR] testJobReExecutionAfterTaskExecutorTermination(org.apache.flink.runtime.taskexecutor.TaskExecutorITCase)  Time elapsed: 1.257 s  <<< ERROR!
      org.apache.flink.util.FlinkException: Could not close resource.
      	at org.apache.flink.runtime.taskexecutor.TaskExecutorITCase.teardown(TaskExecutorITCase.java:83)
      Caused by: org.apache.flink.util.FlinkException: Error while shutting the TaskExecutor down.
      Caused by: org.apache.flink.util.FlinkException: Could not properly shut down the TaskManager services.
      Caused by: java.lang.IllegalStateException: NetworkBufferPool is not empty after destroying all LocalBufferPools
      

      https://api.travis-ci.org/v3/job/493221318/log.txt

      The problem seems to be caused by the TaskExecutor not properly waiting for all running Tasks to terminate. This creates a race condition in which not all buffers have been returned to the BufferPool by the time it is checked during shutdown.
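
      The suspected race can be illustrated with a minimal, self-contained sketch. It is not Flink code: ToyBufferPool, outstandingAtShutdown and the 50 ms cleanup delay are hypothetical stand-ins chosen for illustration. Each task holds one buffer until it is cancelled; the shutdown path either checks the pool right away or first waits on the tasks' termination futures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration only; none of these names are Flink classes. */
public class ShutdownOrderingSketch {

    /** Hypothetical pool that only counts buffers handed out and not yet returned. */
    static final class ToyBufferPool {
        private final AtomicInteger outstanding = new AtomicInteger();
        void request() { outstanding.incrementAndGet(); }
        void recycle() { outstanding.decrementAndGet(); }
        int outstanding() { return outstanding.get(); }
    }

    /**
     * Starts numTasks tasks that each hold one buffer until cancelled, then shuts down.
     * Returns how many buffers are still outstanding when the pool is checked.
     */
    static int outstandingAtShutdown(int numTasks, boolean waitForTasks) {
        ToyBufferPool pool = new ToyBufferPool();
        CountDownLatch cancelSignal = new CountDownLatch(1);
        ExecutorService executor = Executors.newFixedThreadPool(numTasks);
        List<CompletableFuture<Void>> terminationFutures = new ArrayList<>();

        for (int i = 0; i < numTasks; i++) {
            pool.request(); // the buffer is assigned before the task body runs
            terminationFutures.add(CompletableFuture.runAsync(() -> {
                try {
                    cancelSignal.await(); // run until cancelled
                    Thread.sleep(50);     // assumed delay: task cleanup takes some time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    pool.recycle();       // the buffer is returned only when the task ends
                }
            }, executor));
        }

        CompletableFuture<Void> allTerminated =
                CompletableFuture.allOf(terminationFutures.toArray(new CompletableFuture[0]));

        cancelSignal.countDown(); // cancel all tasks
        if (waitForTasks) {
            allTerminated.join(); // wait for every task before checking the pool
        }
        int outstanding = pool.outstanding(); // the "pool is empty" check at shutdown

        allTerminated.join();     // clean up the sketch itself
        executor.shutdown();
        return outstanding;
    }

    public static void main(String[] args) {
        System.out.println("without waiting: " + outstandingAtShutdown(4, false));
        System.out.println("with waiting:    " + outstandingAtShutdown(4, true));
    }
}
```

      With waitForTasks set to true the check always sees an empty pool, because every task has already returned its buffer. Without waiting, the result depends on timing, which is the kind of intermittent failure described above.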

            People

              SleePy Biao Liu
              trohrmann Till Rohrmann
              Votes: 0
              Watchers: 5

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h