Flink / FLINK-33574

testRecoverAfterMultiplePersistsStateWithMultiPart and testRecoverFromIntermWithoutAdditionalStateWithMultiPart run into timeouts

    Description

      https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54446&view=logs&j=4eda0b4a-bd0d-521a-0916-8285b9be9bb5&t=2ff6d5fa-53a6-53ac-bff7-fa524ea361a9

      Multiple connect_1 stages fail due to a timeout:

      Nov 09 02:09:33 "main" #1 prio=5 os_prio=0 tid=0x00007efd5400b800 nid=0x7c0e waiting on condition [0x00007efd5ccd8000]
      Nov 09 02:09:33    java.lang.Thread.State: WAITING (parking)
      Nov 09 02:09:33 	at sun.misc.Unsafe.park(Native Method)
      Nov 09 02:09:33 	- parking to wait for  <0x00000000b762d130> (a java.util.concurrent.CompletableFuture$Signaller)
      Nov 09 02:09:33 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      Nov 09 02:09:33 	at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
      Nov 09 02:09:33 	at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
      Nov 09 02:09:33 	at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
      Nov 09 02:09:33 	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
      Nov 09 02:09:33 	at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartUploadToComplete(RecoverableMultiPartUploadImpl.java:233)
      Nov 09 02:09:33 	at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartsUpload(RecoverableMultiPartUploadImpl.java:223)
      Nov 09 02:09:33 	at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:152)
      Nov 09 02:09:33 	at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetCommitter(RecoverableMultiPartUploadImpl.java:122)
      Nov 09 02:09:33 	at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetCommitter(RecoverableMultiPartUploadImpl.java:56)
      Nov 09 02:09:33 	at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeForCommit(S3RecoverableFsDataOutputStream.java:178)
      Nov 09 02:09:33 	at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersist(AbstractHadoopRecoverableWriterITCase.java:375)
      Nov 09 02:09:33 	at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersistWithMultiPartUploads(AbstractHadoopRecoverableWriterITCase.java:330)
      Nov 09 02:09:33 	at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testRecoverAfterMultiplePersistsStateWithMultiPart(AbstractHadoopRecoverableWriterITCase.java:318)
      
      [...]

      And

      Nov 09 01:53:59 "main" #1 prio=5 os_prio=0 cpu=3732.81ms elapsed=1707.61s tid=0x00007f7bec028000 nid=0x3e5 waiting on condition  [0x00007f7bf2c80000]
      Nov 09 01:53:59    java.lang.Thread.State: WAITING (parking)
      Nov 09 01:53:59 	at jdk.internal.misc.Unsafe.park(java.base@11.0.19/Native Method)
      Nov 09 01:53:59 	- parking to wait for  <0x00000000aff7e730> (a java.util.concurrent.CompletableFuture$Signaller)
      Nov 09 01:53:59 	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.19/LockSupport.java:194)
      Nov 09 01:53:59 	at java.util.concurrent.CompletableFuture$Signaller.block(java.base@11.0.19/CompletableFuture.java:1796)
      Nov 09 01:53:59 	at java.util.concurrent.ForkJoinPool.managedBlock(java.base@11.0.19/ForkJoinPool.java:3128)
      Nov 09 01:53:59 	at java.util.concurrent.CompletableFuture.waitingGet(java.base@11.0.19/CompletableFuture.java:1823)
      Nov 09 01:53:59 	at java.util.concurrent.CompletableFuture.get(java.base@11.0.19/CompletableFuture.java:1998)
      Nov 09 01:53:59 	at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartUploadToComplete(RecoverableMultiPartUploadImpl.java:233)
      Nov 09 01:53:59 	at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartsUpload(RecoverableMultiPartUploadImpl.java:223)
      Nov 09 01:53:59 	at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:152)
      Nov 09 01:53:59 	at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:56)
      Nov 09 01:53:59 	at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.persist(S3RecoverableFsDataOutputStream.java:167)
      Nov 09 01:53:59 	at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersist(AbstractHadoopRecoverableWriterITCase.java:351)
      Nov 09 01:53:59 	at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersistWithMultiPartUploads(AbstractHadoopRecoverableWriterITCase.java:330)
      Nov 09 01:53:59 	at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testRecoverFromIntermWithoutAdditionalStateWithMultiPart(AbstractHadoopRecoverableWriterITCase.java:312)
      
      [...]
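
      Both traces show the main thread parked in the untimed CompletableFuture.get() called from awaitPendingPartUploadToComplete, so the test only ends when the CI watchdog kills it. As an illustration of that pattern only, here is a minimal sketch of a bounded wait. It is not Flink code: the class BoundedAwaitSketch, the method awaitWithTimeout, and the timeout values are hypothetical, and whether a timeout is the right fix for this issue is not established here.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; not part of Flink's RecoverableMultiPartUploadImpl.
public class BoundedAwaitSketch {

    /**
     * Waits for a pending future, but gives up after timeoutMillis instead of
     * parking the calling thread indefinitely.
     *
     * @return true if the future completed in time, false if the wait timed out
     */
    static boolean awaitWithTimeout(CompletableFuture<?> pending, long timeoutMillis) {
        try {
            // Timed variant of get(): throws TimeoutException when the deadline passes.
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            // Restore the interrupt flag before surfacing the failure.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            // The future completed exceptionally; surface the underlying cause.
            throw new IllegalStateException("Pending operation failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // A future that is already complete returns immediately.
        CompletableFuture<String> completed = CompletableFuture.completedFuture("part-1");
        // A future that nobody ever completes mimics the hang in the traces above.
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();

        System.out.println("completed in time: " + awaitWithTimeout(completed, 100));
        System.out.println("completed in time: " + awaitWithTimeout(neverCompleted, 100));
    }
}
```

      With a bounded wait, a stuck part upload would surface as a test failure at the call site rather than as a stage-level timeout with only a thread dump to go on.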

      People

        Assignee: Unassigned
        Reporter: Matthias Pohl (mapohl)
        Votes: 0
        Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: