Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.19.0
- Fix Version/s: None
Description
Multiple connect_1 stages fail due to a timeout:
Nov 09 02:09:33 "main" #1 prio=5 os_prio=0 tid=0x00007efd5400b800 nid=0x7c0e waiting on condition [0x00007efd5ccd8000]
Nov 09 02:09:33    java.lang.Thread.State: WAITING (parking)
Nov 09 02:09:33     at sun.misc.Unsafe.park(Native Method)
Nov 09 02:09:33     - parking to wait for <0x00000000b762d130> (a java.util.concurrent.CompletableFuture$Signaller)
Nov 09 02:09:33     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Nov 09 02:09:33     at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
Nov 09 02:09:33     at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
Nov 09 02:09:33     at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
Nov 09 02:09:33     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
Nov 09 02:09:33     at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartUploadToComplete(RecoverableMultiPartUploadImpl.java:233)
Nov 09 02:09:33     at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartsUpload(RecoverableMultiPartUploadImpl.java:223)
Nov 09 02:09:33     at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:152)
Nov 09 02:09:33     at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetCommitter(RecoverableMultiPartUploadImpl.java:122)
Nov 09 02:09:33     at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetCommitter(RecoverableMultiPartUploadImpl.java:56)
Nov 09 02:09:33     at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeForCommit(S3RecoverableFsDataOutputStream.java:178)
Nov 09 02:09:33     at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersist(AbstractHadoopRecoverableWriterITCase.java:375)
Nov 09 02:09:33     at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersistWithMultiPartUploads(AbstractHadoopRecoverableWriterITCase.java:330)
Nov 09 02:09:33     at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testRecoverAfterMultiplePersistsStateWithMultiPart(AbstractHadoopRecoverableWriterITCase.java:318)
[...]
And
Nov 09 01:53:59 "main" #1 prio=5 os_prio=0 cpu=3732.81ms elapsed=1707.61s tid=0x00007f7bec028000 nid=0x3e5 waiting on condition [0x00007f7bf2c80000]
Nov 09 01:53:59    java.lang.Thread.State: WAITING (parking)
Nov 09 01:53:59     at jdk.internal.misc.Unsafe.park(java.base@11.0.19/Native Method)
Nov 09 01:53:59     - parking to wait for <0x00000000aff7e730> (a java.util.concurrent.CompletableFuture$Signaller)
Nov 09 01:53:59     at java.util.concurrent.locks.LockSupport.park(java.base@11.0.19/LockSupport.java:194)
Nov 09 01:53:59     at java.util.concurrent.CompletableFuture$Signaller.block(java.base@11.0.19/CompletableFuture.java:1796)
Nov 09 01:53:59     at java.util.concurrent.ForkJoinPool.managedBlock(java.base@11.0.19/ForkJoinPool.java:3128)
Nov 09 01:53:59     at java.util.concurrent.CompletableFuture.waitingGet(java.base@11.0.19/CompletableFuture.java:1823)
Nov 09 01:53:59     at java.util.concurrent.CompletableFuture.get(java.base@11.0.19/CompletableFuture.java:1998)
Nov 09 01:53:59     at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartUploadToComplete(RecoverableMultiPartUploadImpl.java:233)
Nov 09 01:53:59     at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartsUpload(RecoverableMultiPartUploadImpl.java:223)
Nov 09 01:53:59     at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:152)
Nov 09 01:53:59     at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:56)
Nov 09 01:53:59     at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.persist(S3RecoverableFsDataOutputStream.java:167)
Nov 09 01:53:59     at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersist(AbstractHadoopRecoverableWriterITCase.java:351)
Nov 09 01:53:59     at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersistWithMultiPartUploads(AbstractHadoopRecoverableWriterITCase.java:330)
Nov 09 01:53:59     at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testRecoverFromIntermWithoutAdditionalStateWithMultiPart(AbstractHadoopRecoverableWriterITCase.java:312)
[...]
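In both dumps the main thread is parked in the untimed CompletableFuture.get() (it goes through waitingGet), called from awaitPendingPartUploadToComplete. The sketch below is not Flink code; the class PendingUploadWait and the method awaitWithTimeout are hypothetical names used only to illustrate the difference between an unbounded wait on a future that never completes and a bounded wait using the timed get overload.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PendingUploadWait {

    /**
     * Waits for a (hypothetical) pending upload future with an upper bound.
     * Returns true if the future completed in time, false if the wait timed out.
     */
    public static boolean awaitWithTimeout(
            CompletableFuture<?> pendingUpload, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException {
        try {
            pendingUpload.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // A future that nobody completes stands in for a part upload that never finishes.
        CompletableFuture<Void> stuckUpload = new CompletableFuture<>();

        // Calling stuckUpload.get() here would park the thread indefinitely,
        // matching the WAITING (parking) state in the dumps above.
        boolean stuckDone = awaitWithTimeout(stuckUpload, 200, TimeUnit.MILLISECONDS);
        System.out.println("stuck upload completed: " + stuckDone);

        // An already completed future returns immediately.
        CompletableFuture<String> finishedUpload = CompletableFuture.completedFuture("done");
        boolean finishedDone = awaitWithTimeout(finishedUpload, 200, TimeUnit.MILLISECONDS);
        System.out.println("finished upload completed: " + finishedDone);
    }
}
```

A bounded wait does not explain why the upload future never completes; it would only turn a silent hang into a failure that can be reported.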
Issue Links
- duplicates FLINK-33115 AbstractHadoopRecoverableWriterITCase is hanging with timeout on AZP (Open)