ORC-12

Add max size of column dictionaries to ORC metadata

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      To predict the amount of memory required to read an ORC file, we need to know the size of the dictionaries for the columns being read. I propose adding the size in bytes of each column's dictionary to the stripe's column statistics. The file's column statistics would then hold, for each column, the maximum dictionary size across all stripes.
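
      A minimal sketch of how a reader might use the proposed statistics, not taken from the attached patches. It assumes the per-stripe dictionary sizes are available as plain mappings from column id to bytes, and that a reader holds the dictionaries of all selected columns for one stripe at a time. The names `file_level_max` and `dictionary_memory_bound` are hypothetical and are not part of the ORC API.

```python
from typing import Dict, Iterable, List

# Hypothetical input shape: one mapping per stripe, column id -> dictionary size in bytes.
StripeDictSizes = Dict[int, int]


def file_level_max(stripes: List[StripeDictSizes]) -> Dict[int, int]:
    """For each column, keep the largest dictionary size seen in any stripe."""
    maxima: Dict[int, int] = {}
    for stripe in stripes:
        for column, size in stripe.items():
            maxima[column] = max(maxima.get(column, 0), size)
    return maxima


def dictionary_memory_bound(maxima: Dict[int, int], columns: Iterable[int]) -> int:
    """Upper bound on dictionary bytes held at once when reading `columns`.

    Columns with no recorded dictionary contribute 0 bytes.
    """
    return sum(maxima.get(column, 0) for column in columns)


# Made-up example values, for illustration only.
stripes = [
    {1: 4096, 2: 120_000},
    {1: 8192, 2: 90_000, 3: 512},
]
maxima = file_level_max(stripes)
bound = dictionary_memory_bound(maxima, [1, 2])
```

      Storing the maximum at file level lets a reader compute this bound from the file footer alone, without scanning every stripe's statistics.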

        Attachments

        1. HIVE-9451.patch
          63 kB
          Owen O'Malley
        2. HIVE-9451.patch
          70 kB
          Owen O'Malley

            People

            • Assignee:
              omalley Owen O'Malley
              Reporter:
              omalley Owen O'Malley
            • Votes:
              0
              Watchers:
              4
