Flink / FLINK-34567

Flink TaskManager error: "Encountered error while consuming partitions"

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.16.2
    • None
    • None

    Description

      I deployed a Flink cluster (version 1.16.2) and it ran normally for about 2 months, but recently I ran into a problem. Some subtasks show high backpressure and the Flink job is completely blocked (see pic1.jpg). These subtasks all run on one TaskManager. After I stopped the abnormal TaskManager and redeployed the Flink job, the problem went away. I found the following error log on the abnormal TaskManager:

      2024-03-03 15:57:25,088 ERROR org.apache.flink.runtime.io.network.netty.PartitionRequestQueue [] - Encountered error while consuming partitions
      org.apache.flink.shaded.netty4.io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection timed out
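To check whether this error is isolated or recurring on a given TaskManager, the log can be scanned for ERROR entries grouped by logger. The following is only a minimal sketch: it assumes every log entry follows the same layout as the single line quoted above (timestamp, level, logger, `[]`, ` - `, message), and the names `parse_log_line` and `count_errors_by_logger` are hypothetical helpers, not part of Flink.

```python
import re
from collections import Counter

# Assumed layout, inferred from the one log line quoted in this issue:
# "<date> <time>,<millis> <LEVEL> <logger> [] - <message>"
LOG_LINE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+)\s+\[\]\s+-\s+"
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of fields for a log entry line, or None for other lines
    (for example exception lines and stack-trace continuations)."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def count_errors_by_logger(lines):
    """Count ERROR-level entries per logger name."""
    counts = Counter()
    for line in lines:
        record = parse_log_line(line)
        if record and record["level"] == "ERROR":
            counts[record["logger"]] += 1
    return counts


# The two lines quoted in this issue, used as sample input.
sample = [
    "2024-03-03 15:57:25,088 ERROR org.apache.flink.runtime.io.network.netty.PartitionRequestQueue [] - Encountered error while consuming partitions",
    "org.apache.flink.shaded.netty4.io.netty.channel.unix.Errors$NativeIoException: readAddress(..) failed: Connection timed out",
]
errors = count_errors_by_logger(sample)
print(errors)
```

In practice the `sample` list would be replaced by the lines of the TaskManager log file. Comparing the counts and timestamps across TaskManagers may help show whether the timeouts are confined to one machine.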

      I checked the machine hosting the abnormal TaskManager. Its CPU, memory, and network look as normal as those on the machines hosting the other TaskManagers, so it doesn't look like a hardware problem.

      What does this error mean?

      What should I do to solve this problem completely?

      Attachments

        1. pic1.jpg
          202 kB
          yamanda

          People

            Assignee: Unassigned
            Reporter: yamanda
            Votes: 0
            Watchers: 1

              Time Tracking

                Estimated: 96h
                Remaining: 96h
                Logged: Not Specified