IMPALA / IMPALA-5347

Parquet scanner has a lot of small CPU inefficiencies

      Description

      I spent some time profiling the Parquet scanner with perf top. In many places the code is inefficient in ways that are easy to fix. Together, these fixes could add up to a significant performance win for scans.

      The assembly of the core MaterializeValueBatch() loop has a lot of obvious inefficiency:

      • Many loads from memory of values that are constant within the loop
      • The generated bit-unpacking and dictionary-decoding code is inefficient in several ways, e.g. it contains a complicated bounds check
      • Hot functions like DictDecoder::Get() are not inlined
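
The first and third points above can be illustrated with a minimal sketch. This is not Impala's code: ToyDictDecoder, ToyScanner and MaterializeBatch are hypothetical names invented for the example, and the tuple layout is an assumption. The loop writes through a uint8_t*, which may alias any object, so a compiler generally has to reload fields read through the scanner reference on every iteration unless they are first copied into locals. Defining Get() in the class body makes it implicitly inline, so its single unsigned bounds check can be folded into the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a dictionary decoder (not Impala's DictDecoder).
struct ToyDictDecoder {
  const int32_t* dict;
  uint32_t dict_size;

  // Defined in the class body, so it is implicitly inline and can be folded
  // into the caller's loop. One unsigned comparison is the whole bounds check.
  bool Get(uint32_t index, int32_t* out) const {
    if (index >= dict_size) return false;
    *out = dict[index];
    return true;
  }
};

// Hypothetical scanner state that stays constant for a whole batch.
struct ToyScanner {
  ToyDictDecoder decoder;
  size_t tuple_byte_size;
  size_t slot_offset;
};

// Decodes `n` dictionary indices into fixed-size tuples stored in `tuples`.
// Returns the number of values written; stops at the first invalid index.
inline size_t MaterializeBatch(const ToyScanner& s, const uint32_t* indices,
                               size_t n, uint8_t* tuples) {
  // Copy loop-invariant state into locals. Stores through `tuples` cannot
  // modify these copies, so they can stay in registers across iterations.
  const ToyDictDecoder decoder = s.decoder;
  const size_t tuple_size = s.tuple_byte_size;
  const size_t offset = s.slot_offset;
  assert(offset + sizeof(int32_t) <= tuple_size);

  for (size_t i = 0; i < n; ++i) {
    int32_t value;
    if (!decoder.Get(indices[i], &value)) return i;
    std::memcpy(tuples + i * tuple_size + offset, &value, sizeof(value));
  }
  return n;
}
```

Whether a given compiler actually hoists the loads or inlines the call depends on optimization settings, so the generated assembly should be checked rather than assumed.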

      Some scans also spend a lot of time in InitTuple() calling memset() on only one or two bytes.
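
One possible way to avoid a library call for such tiny sizes is sketched below. ZeroPrefix and InitTuples are hypothetical helpers, not Impala functions, and the assumption that the bytes to clear form a short prefix of each tuple is made only for illustration. When the length is known only at run time, memset() is typically emitted as a real function call, whose overhead can dominate the cost of clearing one or two bytes; handling the small cases with direct stores avoids that.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: zero the first `len` bytes of `dst`. Lengths of one or
// two bytes use plain stores; larger lengths fall back to memset().
inline void ZeroPrefix(uint8_t* dst, size_t len) {
  switch (len) {
    case 0:
      return;
    case 1:
      dst[0] = 0;
      return;
    case 2:
      dst[0] = 0;
      dst[1] = 0;
      return;
    default:
      std::memset(dst, 0, len);
      return;
  }
}

// Hypothetical batch initializer: clears the prefix of each fixed-size tuple.
// `prefix_len` is the same for every tuple in the batch.
inline void InitTuples(uint8_t* tuples, size_t num_tuples, size_t tuple_size,
                       size_t prefix_len) {
  for (size_t i = 0; i < num_tuples; ++i) {
    ZeroPrefix(tuples + i * tuple_size, prefix_len);
  }
}
```

Because prefix_len is constant across the batch, a compiler that inlines ZeroPrefix() may be able to move the switch out of the loop; that behavior is not guaranteed and is worth confirming in a profile.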

            People

            • Assignee: Tim Armstrong (tarmstrong)
            • Reporter: Tim Armstrong (tarmstrong)