Flink / FLINK-4860 Sort performance / FLINK-4705

Instrument FixedLengthRecordSorter

Details

    Description

      The NormalizedKeySorter sorts on the concatenation of (potentially partial) keys plus an 8-byte pointer to the record. After sorting, each pointer must be dereferenced to reach its record, which is not cache friendly.

      The FixedLengthRecordSorter sorts on the concatenation of full keys followed by the remainder of the record. The records can then be deserialized in sequence.
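
      The difference between the two layouts can be illustrated with a toy sketch. It is not Flink's implementation: the class SortLayoutSketch, its methods, and the 4-byte key and 4-byte payload packed into a long are all assumptions made for illustration. The first method sorts (key, pointer) entries and then follows each pointer into a separate record store; the second sorts whole (key, payload) records and reads the result sequentially.

```java
import java.util.Arrays;

/** Toy contrast of the two sort-buffer layouts; hypothetical, not Flink code. */
class SortLayoutSketch {

    /** Pointer layout: sort (key, pointer) entries, then dereference each pointer. */
    static int[] sortWithPointers(int[] keys, int[] payloads) {
        long[] index = new long[keys.length];
        for (int i = 0; i < keys.length; i++) {
            // High 32 bits hold the key, low 32 bits hold the record position.
            index[i] = ((long) keys[i] << 32) | i;
        }
        Arrays.sort(index);
        int[] sorted = new int[keys.length];
        for (int i = 0; i < index.length; i++) {
            int pointer = (int) index[i];   // low 32 bits
            sorted[i] = payloads[pointer];  // scattered read into the record store
        }
        return sorted;
    }

    /** Fixed-length layout: sort whole records, then read them in sequence. */
    static int[] sortWholeRecords(int[] keys, int[] payloads) {
        long[] records = new long[keys.length];
        for (int i = 0; i < keys.length; i++) {
            // High 32 bits hold the key, low 32 bits hold the rest of the record.
            records[i] = ((long) keys[i] << 32) | (payloads[i] & 0xFFFFFFFFL);
        }
        Arrays.sort(records);
        int[] sorted = new int[keys.length];
        for (int i = 0; i < records.length; i++) {
            sorted[i] = (int) records[i];   // sequential read, no pointer chase
        }
        return sorted;
    }
}
```

      With distinct keys both methods return the payloads in the same key order; only the memory access pattern of the final loop differs.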

      Instrumenting the FixedLengthRecordSorter requires implementing the comparator methods writeWithKeyNormalization and readWithKeyNormalization.
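
      A minimal sketch of what such a pair of methods might look like for a record with an int key and a long value. To stay self-contained it uses java.io.DataOutput and java.io.DataInput rather than Flink's view types, and the class KeyNormalizedRecordSketch, the Rec type, and the toBytes/fromBytes helpers are hypothetical. The key is written big-endian with its sign bit flipped so that unsigned byte-wise comparison of the serialized form matches signed key order; the remainder of the record follows the key.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical fixed-length record: 4-byte key followed by an 8-byte value. */
class KeyNormalizedRecordSketch {

    static final class Rec {
        int key;
        long value;

        Rec(int key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Writes the normalized key first, then the rest of the record. */
    static void writeWithKeyNormalization(Rec record, DataOutput target) throws IOException {
        target.writeInt(record.key ^ Integer.MIN_VALUE); // flip sign bit
        target.writeLong(record.value);
    }

    /** Reverses the normalization and fills the reusable record instance. */
    static Rec readWithKeyNormalization(Rec reuse, DataInput source) throws IOException {
        reuse.key = source.readInt() ^ Integer.MIN_VALUE; // undo sign-bit flip
        reuse.value = source.readLong();
        return reuse;
    }

    /** Helper for experiments: serialize one record to a 12-byte array. */
    static byte[] toBytes(Rec record) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            writeWithKeyNormalization(record, new DataOutputStream(bytes));
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Helper for experiments: deserialize one record from its byte form. */
    static Rec fromBytes(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            return readWithKeyNormalization(new Rec(0, 0L), in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      Because the serialized form is itself sortable by raw bytes, a sorter can compare and swap whole records without deserializing them, then call the read method once per record in sequence.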

      In tests of JaccardIndex on an m4.16xlarge, the scale 18 runtime dropped from 71.8 s to 68.8 s (4.3% faster) and the scale 20 runtime dropped from 546.1 s to 501.8 s (8.8% faster).

          People

            Assignee: Unassigned
            Reporter: Greg Hogan (greghogan)

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 10m