ORC / ORC-12

Add max size of column dictionaries to ORC metadata

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      To predict how much memory is needed to read an ORC file, we need to know the size of the dictionaries for the columns being read. I propose adding the size in bytes of each column's dictionary to the stripe's column statistics. The file's column statistics would then hold the maximum dictionary size for each column across all stripes.
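
      As a rough illustration of the proposal, the sketch below folds per-stripe dictionary sizes into a per-column maximum and uses it to bound the dictionary memory for a set of columns. It is not ORC's API: the function names, the dict-based layout, and the sample byte counts are all hypothetical, and it assumes a reader holds the dictionaries of only one stripe at a time.

```python
from typing import Dict, Iterable, List

# Hypothetical layout: one mapping per stripe, column name -> dictionary size in bytes.
StripeDictSizes = Dict[str, int]


def file_level_max(stripes: List[StripeDictSizes]) -> Dict[str, int]:
    """Fold per-stripe dictionary sizes into a per-column maximum."""
    maxima: Dict[str, int] = {}
    for stripe in stripes:
        for column, size in stripe.items():
            maxima[column] = max(maxima.get(column, 0), size)
    return maxima


def estimate_dictionary_memory(maxima: Dict[str, int], columns: Iterable[str]) -> int:
    """Upper bound on dictionary bytes for the selected columns.

    Assumes only one stripe's dictionaries are resident at a time;
    columns without a recorded dictionary contribute zero.
    """
    return sum(maxima.get(column, 0) for column in columns)


# Made-up sizes for two stripes of an imaginary file.
stripes = [
    {"name": 4096, "city": 1024},
    {"name": 8192, "city": 512},
]
maxima = file_level_max(stripes)
needed = estimate_dictionary_memory(maxima, ["name", "city"])
```

      Summing the per-column maxima can overestimate, since the largest dictionaries of different columns may fall in different stripes, but it needs only the file-level statistics and no per-stripe scan.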

      Attachments

        1. HIVE-9451.patch
          63 kB
          Owen O'Malley
        2. HIVE-9451.patch
          70 kB
          Owen O'Malley

          People

            omalley Owen O'Malley