Hive / HIVE-9188

BloomFilter support in ORC


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.15.0
    • Fix Version/s: 1.2.0
    • Component/s: File Formats
    • Release Note: Support for Bloom Filters in ORC internal index.

    Description

      Bloom filters are a well-known probabilistic data structure for set membership checking. We can use them in the ORC index for better row group pruning. Currently, the ORC row group index uses min/max statistics to eliminate row groups (and stripes) that cannot satisfy the predicate specified in the query. In some cases, however, min/max-based elimination is ineffective, for example on unsorted columns with a wide range of values, where nearly every row group's min/max range covers the predicate value. Bloom filters can be an effective and efficient alternative for row group/split elimination for point queries or queries with an IN clause.
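
      The idea described above can be sketched roughly as follows. This is only an illustration, not the code from the attached patches: the class names `BloomFilterSketch` and `RowGroupSketch`, the sizing parameters, and the hash mixing are all assumptions made for the example. It shows a row group that keeps min/max statistics plus a Bloom filter, and that is read only if both checks say the key might be present.

```java
import java.util.BitSet;

// Hypothetical illustration only; not the ORC implementation from the patch.
final class BloomFilterSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomFilterSketch(int expectedEntries, double falsePositiveRate) {
        int n = Math.max(1, expectedEntries);
        double ln2 = Math.log(2);
        // Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        this.numBits = Math.max(1, (int) Math.ceil(-n * Math.log(falsePositiveRate) / (ln2 * ln2)));
        this.numHashes = Math.max(1, (int) Math.round((double) numBits / n * ln2));
        this.bits = new BitSet(numBits);
    }

    void add(long value) {
        long h = mix(value);
        int h1 = (int) h;
        int h2 = (int) (h >>> 32);
        // Double hashing: derive k bit positions from two 32-bit halves.
        for (int i = 1; i <= numHashes; i++) {
            bits.set(Math.floorMod(h1 + i * h2, numBits));
        }
    }

    boolean mightContain(long value) {
        long h = mix(value);
        int h1 = (int) h;
        int h2 = (int) (h >>> 32);
        for (int i = 1; i <= numHashes; i++) {
            if (!bits.get(Math.floorMod(h1 + i * h2, numBits))) {
                return false; // definitely absent
            }
        }
        return true; // present, or a false positive
    }

    // 64-bit mixing step (constants from the MurmurHash3 finalizer).
    private static long mix(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }
}

// Hypothetical per-row-group index entry: min/max statistics plus a Bloom filter.
final class RowGroupSketch {
    final long min;
    final long max;
    final BloomFilterSketch bloom;

    RowGroupSketch(long[] values) {
        long lo = Long.MAX_VALUE;
        long hi = Long.MIN_VALUE;
        bloom = new BloomFilterSketch(values.length, 0.05); // 5% target rate is an assumption
        for (long v : values) {
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
            bloom.add(v);
        }
        min = lo;
        max = hi;
    }

    boolean minMaxMayMatch(long key) {
        return key >= min && key <= max;
    }

    // Point query (col = key): the row group is read only if both checks pass.
    boolean mayMatchEquals(long key) {
        return minMaxMayMatch(key) && bloom.mightContain(key);
    }

    // IN clause: the row group is read if any listed key may be present.
    boolean mayMatchIn(long... keys) {
        for (long key : keys) {
            if (mayMatchEquals(key)) {
                return true;
            }
        }
        return false;
    }
}
```

      For a row group holding the unsorted values 5, 1000000, 37 and 99999, min/max alone cannot rule out a key such as 500 because it lies inside the range, whereas the Bloom filter check will usually reject it. Since a Bloom filter never reports a false negative, skipping a row group on a negative answer is safe; a false positive only costs an unnecessary read.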

      Attachments

        1. HIVE-9188.9.patch (298 kB, Prasanth Jayachandran)
        2. HIVE-9188.8.patch (298 kB, Prasanth Jayachandran)
        3. HIVE-9188.7.patch (298 kB, Prasanth Jayachandran)
        4. HIVE-9188.6.patch (302 kB, Prasanth Jayachandran)
        5. HIVE-9188.5.patch (296 kB, Prasanth Jayachandran)
        6. HIVE-9188.4.patch (192 kB, Prasanth Jayachandran)
        7. HIVE-9188.3.patch (136 kB, Prasanth Jayachandran)
        8. HIVE-9188.2.patch (135 kB, Prasanth Jayachandran)
        9. HIVE-9188.11.patch (317 kB, Prasanth Jayachandran)
        10. HIVE-9188.10.patch (315 kB, Prasanth Jayachandran)
        11. HIVE-9188.1.patch (124 kB, Prasanth Jayachandran)

            People

              Assignee: Prasanth Jayachandran (prasanth_j)
              Reporter: Prasanth Jayachandran (prasanth_j)
              Votes: 0
              Watchers: 11
