FLINK-33954

Large record may cause the hybrid shuffle to hang

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Runtime / Network
    • Labels: None

    Description

      In some cases, a job may hang when the local buffer pool does not have enough buffers. For instance, with a parallelism of 4, the HashBufferAccumulator is used. Suppose the local buffer pool has 5 buffers and, at some point, 3 of them are held by 3 subpartitions and are not yet finished, so only 2 buffers are left. If a record larger than 2 buffers arrives, the program hangs while requesting buffers.
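
      The arithmetic of the scenario can be sketched with a toy model. This is not Flink code: `ToyBufferPool`, `buffersNeeded`, and `writeRecord` are hypothetical names, the 32 KiB buffer size is an assumption, and the sketch assumes that no buffer is returned to the pool while the record is being written. A non-blocking request stands in for the blocking request that would wait indefinitely in the real job.

```java
import java.util.concurrent.Semaphore;

public class BufferPoolHangSketch {

    /** Hypothetical stand-in for a fixed-size local buffer pool (not a Flink class). */
    static final class ToyBufferPool {
        private final Semaphore free;

        ToyBufferPool(int size) {
            this.free = new Semaphore(size);
        }

        /** Non-blocking request; a blocking request would wait forever once this returns false. */
        boolean tryRequest() {
            return free.tryAcquire();
        }

        int available() {
            return free.availablePermits();
        }
    }

    /** Number of buffers a record occupies, rounded up. */
    static int buffersNeeded(int recordBytes, int bufferBytes) {
        return (recordBytes + bufferBytes - 1) / bufferBytes;
    }

    /** Requests buffers one by one and returns how many were obtained before the pool ran dry. */
    static int writeRecord(ToyBufferPool pool, int needed) {
        int obtained = 0;
        while (obtained < needed && pool.tryRequest()) {
            obtained++;
        }
        return obtained;
    }

    public static void main(String[] args) {
        int bufferBytes = 32 * 1024;              // assumed buffer size
        ToyBufferPool pool = new ToyBufferPool(5); // pool size from the description

        // Three subpartitions each hold one unfinished buffer.
        for (int i = 0; i < 3; i++) {
            pool.tryRequest();
        }

        // A record just over 2 buffers in size needs 3 buffers.
        int needed = buffersNeeded(2 * bufferBytes + 1, bufferBytes);
        int obtained = writeRecord(pool, needed);

        System.out.println("needed=" + needed + ", obtained=" + obtained
                + ", free=" + pool.available());
        if (obtained < needed) {
            System.out.println("A blocking request for the next buffer would never return.");
        }
    }
}
```

      In this model the writer obtains the 2 remaining buffers, still needs a third, and the pool has none left, which matches the hang described above.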

          People

            Assignee: Unassigned
            Reporter: Jiang Xin
            Votes: 0
            Watchers: 2
