HIVE-4727 (sub-task of HIVE-4160, Vectorized Query Execution in Hive)

Optimize ORC StringTreeReader::nextVector to not create dictionary of strings for each call to nextVector

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: vectorization-branch, 0.13.0
    • Component/s: None
    • Labels: None

      Description

      Currently, ORC's StringTreeReader::nextVector builds the dictionary of strings on every call. This hurts performance because each call allocates and then discards a large amount of memory. Since the dictionary does not change within a stripe, StringTreeReader::nextVector should be optimized to build the dictionary only once, when the stripe is read.
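
      The idea can be illustrated with a minimal Java sketch. This is not the code from the attached patch or Hive's actual StringTreeReader; the class CachedDictionaryReader, its methods startStripe and nextVector(int[]), and the blob/offsets layout are all hypothetical and simplified. It only shows the pattern: decode the dictionary once per stripe, then let each batch do lookups against the cached copy.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for a dictionary-encoded string column reader.
// Not Hive's StringTreeReader: all names and the data layout are assumptions.
class CachedDictionaryReader {
    private String[] dictionary = new String[0]; // decoded once per stripe
    private int dictionaryBuilds = 0;            // counts how often we decoded

    // Called once when a stripe is opened: decode every dictionary entry.
    // offsets has one more element than there are entries; entry i occupies
    // bytes offsets[i] (inclusive) to offsets[i + 1] (exclusive) of blob.
    void startStripe(byte[] blob, int[] offsets) {
        int count = offsets.length - 1;
        dictionary = new String[count];
        for (int i = 0; i < count; i++) {
            int length = offsets[i + 1] - offsets[i];
            dictionary[i] = new String(blob, offsets[i], length, StandardCharsets.UTF_8);
        }
        dictionaryBuilds++;
    }

    // Called once per batch: only looks entries up, never rebuilds the dictionary.
    String[] nextVector(int[] dictionaryIds) {
        String[] batch = new String[dictionaryIds.length];
        for (int i = 0; i < dictionaryIds.length; i++) {
            batch[i] = dictionary[dictionaryIds[i]];
        }
        return batch;
    }

    int getDictionaryBuilds() {
        return dictionaryBuilds;
    }
}

class StripeDictionarySketch {
    public static void main(String[] args) {
        CachedDictionaryReader reader = new CachedDictionaryReader();
        byte[] blob = "applebananacherry".getBytes(StandardCharsets.UTF_8);
        reader.startStripe(blob, new int[] {0, 5, 11, 17});

        // Two batches from the same stripe share one decoded dictionary.
        System.out.println(String.join(",", reader.nextVector(new int[] {2, 0, 0})));
        System.out.println(String.join(",", reader.nextVector(new int[] {1})));
        System.out.println("dictionary builds: " + reader.getDictionaryBuilds());
    }
}
```

      In this sketch the per-batch cost is one array of references, while the dictionary strings themselves are allocated only in startStripe. A real reader would also have to reset the cache when it seeks to a different stripe.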

        Attachments

        1. Hive-4727.0.patch
          3 kB
          Sarvesh Sakalanaga

            People

            • Assignee: Sarvesh Sakalanaga (sarvesh.sn)
            • Reporter: Sarvesh Sakalanaga (sarvesh.sn)
