Hive / HIVE-4244 Make string dictionaries adaptive in ORC / HIVE-4324

ORC Turn off dictionary encoding when number of distinct keys is greater than threshold


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.11.0
    • Fix Version/s: 0.12.0
    • Component/s: File Formats
    • Labels: None

    Description

      Add a configurable threshold, expressed as a fraction of the non-null values in a string column. If the number of distinct values in the column is greater than that fraction of its non-null values, dictionary encoding is turned off for that column.
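
      The check described above can be sketched roughly as follows. This is only an illustration of the comparison, not the code from the attached patches: the class name, the method name, and the 0.8 threshold used in main are all hypothetical, and the real ORC writer tracks its dictionary incrementally rather than building a set from a list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and values are assumptions, not Hive's actual code.
final class DictionaryThresholdSketch {

    /**
     * Returns true if dictionary encoding should stay on for the column,
     * i.e. the number of distinct values is at most threshold * non-null count.
     */
    static boolean useDictionaryEncoding(List<String> values, double threshold) {
        Set<String> distinct = new HashSet<>();
        long nonNull = 0;
        for (String v : values) {
            if (v != null) {          // nulls do not count toward either side
                nonNull++;
                distinct.add(v);
            }
        }
        // With no non-null values this is 0 <= 0, so the dictionary stays on.
        return distinct.size() <= threshold * nonNull;
    }

    public static void main(String[] args) {
        double threshold = 0.8; // assumed value, for illustration only

        // 2 distinct keys out of 4 non-null values: ratio 0.5, below the threshold.
        List<String> repetitive = Arrays.asList("a", "a", "b", "a", null);
        // 4 distinct keys out of 4 non-null values: ratio 1.0, above the threshold.
        List<String> unique = Arrays.asList("a", "b", "c", "d");

        System.out.println(useDictionaryEncoding(repetitive, threshold));
        System.out.println(useDictionaryEncoding(unique, threshold));
    }
}
```

      A column made up mostly of unique strings gains little from a dictionary, since nearly every value needs its own entry, which is the case the threshold is meant to catch.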

      Attachments

        1. HIVE-4324.D12045.3.patch (39 kB, Phabricator)
        2. HIVE-4324.D12045.2.patch (39 kB, Phabricator)
        3. HIVE-4324.D12045.2.patch (39 kB, Owen O'Malley)
        4. HIVE-4324.D12045.1.patch (38 kB, Phabricator)
        5. HIVE-4324.1.patch.txt (73 kB, Kevin Wilfong)


          People

            Assignee: Kevin Wilfong (kevinwilfong)
            Reporter: Kevin Wilfong (kevinwilfong)
            Votes: 0
            Watchers: 7

