Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-31675

Deadlock in AWS Connectors following content-length AWS SDK exception

    XMLWordPrintableJSON

Details

    Description

      Connector calls to AWS services can hang on a canceled future following a content-length mismatch that isn't handled gracefully by the SDK:

       

      org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.FutureCancelledException: java.io.IOException: Response had content-length of 31 bytes, but only received 0 bytes before the connection was closed.
      at org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.lambda$null$3(NettyRequestExecutor.java:136)
      at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
      at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106)
      at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
      at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
      at org.apache.flink.kinesis.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
      at org.apache.flink.kinesis.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
      at org.apache.flink.kinesis.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
      at java.base/java.lang.Thread.run(Thread.java:829)
      Caused by: java.io.IOException: Response had content-length of 31 bytes, but only received 0 bytes before the connection was closed.
      at org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.validateResponseContentLength(ResponseHandler.java:163)
      at org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.ResponseHandler.access$700(ResponseHandler.java:75)
      at org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.ResponseHandler$PublisherAdapter$1.onComplete(ResponseHandler.java:369)
      at org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.complete(HandlerPublisher.java:447)
      at org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.nrs.HandlerPublisher.channelInactive(HandlerPublisher.java:430)
      at org.apache.flink.kinesis.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
      at org.apache.flink.kinesis.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
      at org.apache.flink.kinesis.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
      at org.apache.flink.kinesis.shaded.io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81)
      at org.apache.flink.kinesis.shaded.io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277)
      at org.apache.flink.kinesis.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
      at org.apache.flink.kinesis.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
      at org.apache.flink.kinesis.shaded.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
      at org.apache.flink.kinesis.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)
      at org.apache.flink.kinesis.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
      at org.apache.flink.kinesis.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
      at org.apache.flink.kinesis.shaded.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901)
      at org.apache.flink.kinesis.shaded.io.netty.handler.codec.http2.AbstractHttp2StreamChannel$Http2ChannelUnsafe$2.run(AbstractHttp2StreamChannel.java:737)
      ... 6 more 

      Related AWS SDK issue: https://github.com/aws/aws-sdk-java-v2/issues/3335

      AWS SDK fix: https://github.com/aws/aws-sdk-java-v2/pull/3855/files

       

      This mishandled exception creates a deadlock situation that prevents the connectors from making any progress.

      We should update the AWS SDK v2 to 2.20.32: https://github.com/aws/aws-sdk-java-v2/commit/eb5619e24e4eaca6f80effa1c43c0cd409cdd53e

      Attachments

        Activity

          People

            antoniovespoli Antonio Vespoli
            antoniovespoli Antonio Vespoli
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: