FLINK-15140

Shuffle data compression does not work with BroadcastRecordWriter.

Details

    Description

      I tested the latest code on the master branch last weekend with additional test cases. Unfortunately, I ran into several problems, including a bug in shuffle data compression.

      The bug appears when BroadcastRecordWriter is used. In pipelined mode, the compressor copies the compressed data back into the input buffer, but with BroadcastRecordWriter the underlying buffer is shared, so the compressed data must not be copied back into it. In blocking mode, we wrongly recycle the buffer when it is not compressed; this problem is also triggered when BroadcastRecordWriter is used.
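      The pipelined-mode failure can be illustrated with a minimal, self-contained sketch. It does not use Flink's classes: the class SharedBufferCorruption, its methods, and the channel names are hypothetical, two java.nio.ByteBuffer views stand in for two consumers of one broadcast buffer, and java.util.zip.Deflater stands in for whichever compressor the shuffle actually uses. Compressing in place through one view is expected to change the bytes the other view reads.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;

// Hypothetical illustration only; none of these names exist in Flink.
final class SharedBufferCorruption {

    /**
     * Compresses the readable bytes of buf and writes the result back into
     * the same memory. Returns the compressed length, or -1 if the compressed
     * form does not fit into the original space.
     */
    static int compressInPlace(ByteBuffer buf) {
        byte[] input = new byte[buf.remaining()];
        buf.duplicate().get(input); // read without moving buf's position

        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] out = new byte[input.length];
        int n = deflater.deflate(out);
        boolean fits = deflater.finished();
        deflater.end();
        if (!fits) {
            return -1;
        }

        // The problematic step: overwrite the input memory with compressed bytes.
        buf.duplicate().put(out, 0, n);
        return n;
    }

    /** Returns true if the second consumer no longer sees the original record. */
    static boolean consumerSeesCorruptedData() {
        byte[] record = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
                .getBytes(StandardCharsets.US_ASCII);

        // One piece of memory, two views: a stand-in for a broadcast buffer
        // that is shared by two subpartitions.
        ByteBuffer memory = ByteBuffer.wrap(record.clone());
        ByteBuffer channelA = memory.duplicate();
        ByteBuffer channelB = memory.duplicate();

        compressInPlace(channelA); // channel A compresses "its" buffer

        byte[] seenByB = new byte[channelB.remaining()];
        channelB.get(seenByB);
        return !Arrays.equals(seenByB, record);
    }

    public static void main(String[] args) {
        System.out.println("channel B corrupted: " + consumerSeesCorruptedData());
    }
}
```

      The duplicate() views share content but keep independent positions, which is why channel B still reads the full length while the leading bytes have already been overwritten.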

      To fix the problem, the reference counter should be maintained correctly for blocking shuffle. For pipelined shuffle, the simplest way may be to disable compression when the underlying buffer is shared. I will open a PR to fix the problem.
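      A rough sketch of the two proposed fixes follows. It is not the actual patch: RefCountedBuffer, CompressionFixSketch, and all method names are hypothetical, "shared" is assumed here to mean a reference count greater than one, and java.util.zip.Deflater again stands in for the real compressor.

```java
import java.util.Arrays;
import java.util.zip.Deflater;

/** Hypothetical reference-counted buffer; not Flink's Buffer interface. */
final class RefCountedBuffer {
    final byte[] memory;
    int size;
    boolean compressed;
    private int refCount = 1;

    RefCountedBuffer(byte[] memory, int size) {
        this.memory = memory;
        this.size = size;
    }

    RefCountedBuffer retain() { refCount++; return this; }

    void recycle() {
        if (refCount == 0) {
            throw new IllegalStateException("buffer already released");
        }
        refCount--;
    }

    int refCount() { return refCount; }

    /** Assumption of this sketch: more than one reference means shared. */
    boolean isShared() { return refCount > 1; }
}

final class CompressionFixSketch {

    /** Returns the compressed bytes, or null if compression does not shrink the data. */
    static byte[] tryCompress(byte[] src, int len) {
        Deflater deflater = new Deflater();
        deflater.setInput(src, 0, len);
        deflater.finish();
        byte[] out = new byte[len];
        int n = deflater.deflate(out);
        boolean smaller = deflater.finished() && n < len;
        deflater.end();
        return smaller ? Arrays.copyOf(out, n) : null;
    }

    /** Pipelined shuffle: compress in place only when nobody else shares the memory. */
    static RefCountedBuffer compressPipelined(RefCountedBuffer buf) {
        if (buf.isShared()) {
            return buf; // shared buffer: skip compression, send as-is
        }
        byte[] out = tryCompress(buf.memory, buf.size);
        if (out != null) {
            System.arraycopy(out, 0, buf.memory, 0, out.length);
            buf.size = out.length;
            buf.compressed = true;
        }
        return buf;
    }

    /** Blocking shuffle: compress into a new buffer and release the input only if it was replaced. */
    static RefCountedBuffer compressBlocking(RefCountedBuffer buf) {
        byte[] out = tryCompress(buf.memory, buf.size);
        if (out == null) {
            return buf; // not compressed: the caller still holds its reference, so do not recycle
        }
        RefCountedBuffer result = new RefCountedBuffer(out, out.length);
        result.compressed = true;
        buf.recycle(); // drop only this consumer's reference to the input
        return result;
    }
}
```

      In this sketch the blocking path never touches the input memory, so sharing is harmless there as long as each consumer releases exactly one reference; the pipelined path trades compression for safety whenever the buffer is shared.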

          People

            kevin.cyj Yingjie Cao

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 40m