  Hive / HIVE-4244 Make string dictionaries adaptive in ORC / HIVE-4324

ORC Turn off dictionary encoding when number of distinct keys is greater than threshold

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.0
    • Fix Version/s: 0.12.0
    • Component/s: File Formats
    • Labels:
      None

      Description

      Add a configurable threshold, expressed as a fraction of the non-null values in a string column. If the number of distinct values in the column is greater than that fraction of the non-null values, dictionary encoding is turned off for the column.
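
      The check described above could be sketched roughly as follows. This is only an illustration, not the code from the attached patches: the class name, method name, and parameters are hypothetical, and the handling of a column with no non-null values is an assumption.

```java
/**
 * Hypothetical helper illustrating the threshold check from the description.
 * None of these names come from the Hive code base.
 */
final class DictionaryThreshold {
    private DictionaryThreshold() {}

    /**
     * Decides whether a string column should keep dictionary encoding.
     *
     * @param distinctValues number of distinct values seen in the column
     * @param nonNullValues  number of non-null values seen in the column
     * @param threshold      configured fraction in [0, 1]
     * @return false when distinct values exceed threshold * non-null values
     */
    static boolean useDictionary(long distinctValues, long nonNullValues, double threshold) {
        if (nonNullValues == 0) {
            // Assumption: with nothing to encode, leave the default (dictionary) in place.
            return true;
        }
        // Dictionary encoding is turned off only when the distinct count is
        // strictly greater than the configured fraction of non-null values.
        return !(distinctValues > threshold * nonNullValues);
    }
}
```

      With a threshold of 0.8, for example, a column with 10 non-null values keeps its dictionary at 5 distinct values and drops it at 9.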

        Attachments

        1. HIVE-4324.1.patch.txt
          73 kB
          Kevin Wilfong
        2. HIVE-4324.D12045.1.patch
          38 kB
          Phabricator
        3. HIVE-4324.D12045.2.patch
          39 kB
          Owen O'Malley
        4. HIVE-4324.D12045.2.patch
          39 kB
          Phabricator
        5. HIVE-4324.D12045.3.patch
          39 kB
          Phabricator

            People

            • Assignee:
              kevinwilfong Kevin Wilfong
              Reporter:
              kevinwilfong Kevin Wilfong
            • Votes:
              0
              Watchers:
              7
