HIVE-4727 (sub-task of HIVE-4160, Vectorized Query Execution in Hive)

Optimize ORC StringTreeReader::nextVector to not create dictionary of strings for each call to nextVector


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: vectorization-branch, 0.13.0
    • Component/s: None
    • Labels: None

    Description

      Currently, ORC StringTreeReader::nextVector creates a dictionary of strings on each call. This hurts performance because a large amount of memory is allocated and deallocated on every call. Since the dictionary does not change within a stripe, StringTreeReader::nextVector should be optimized to build the dictionary only once, when the stripe is read.
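
      The idea can be sketched roughly as follows. This is not the code from the attached patch: the class CachedStringReader, its startStripe method, and the dictionaryBuilds counter are hypothetical names used only to illustrate decoding the dictionary once per stripe and letting each nextVector call reference the cached entries.

```java
import java.nio.charset.StandardCharsets;

public class StripeDictionarySketch {

    /** Hypothetical stand-in for a dictionary-encoded string column reader. */
    static class CachedStringReader {
        private byte[][] dictionary;   // decoded once per stripe
        private int[] rowIds;          // dictionary index of each row in the stripe
        private int position;          // next row to hand out
        int dictionaryBuilds = 0;      // illustration only: counts dictionary decodes

        /** Called once when a stripe is opened; the dictionary is decoded here. */
        void startStripe(String[] stripeDictionary, int[] stripeRowIds) {
            dictionary = new byte[stripeDictionary.length][];
            for (int i = 0; i < stripeDictionary.length; i++) {
                dictionary[i] = stripeDictionary[i].getBytes(StandardCharsets.UTF_8);
            }
            rowIds = stripeRowIds;
            position = 0;
            dictionaryBuilds++;
        }

        /** Returns up to batchSize rows that point at the cached dictionary entries. */
        byte[][] nextVector(int batchSize) {
            int n = Math.min(batchSize, rowIds.length - position);
            byte[][] batch = new byte[n][];
            for (int i = 0; i < n; i++) {
                // Reference the cached entry; nothing is re-decoded or copied per call.
                batch[i] = dictionary[rowIds[position + i]];
            }
            position += n;
            return batch;
        }
    }

    public static void main(String[] args) {
        CachedStringReader reader = new CachedStringReader();
        reader.startStripe(new String[] {"apple", "pear"}, new int[] {0, 1, 1, 0, 1});

        int batches = 0;
        while (reader.nextVector(2).length > 0) {
            batches++;
        }
        System.out.println("non-empty batches: " + batches
            + ", dictionary builds: " + reader.dictionaryBuilds);
    }
}
```

      In this sketch the per-call cost is limited to filling the batch with references, while the allocation for the dictionary happens only in startStripe. A real reader would also have to reset the cache whenever it moves to a new stripe.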

      Attachments

        1. Hive-4727.0.patch
          3 kB
          Sarvesh Sakalanaga

          People

            Assignee: Sarvesh Sakalanaga (sarvesh.sn)
            Reporter: Sarvesh Sakalanaga (sarvesh.sn)
            Votes: 0
            Watchers: 2
