  1. Hive
  2. HIVE-22980

Support custom path filter for ORC tables



    • Type: New Feature
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ORC
    • Labels:


      The customer is looking for an option to specify a custom path filter for ORC tables. The details below are taken from the customer's requirement.

      Problem statement/approach in the customer's words:

      Currently, the ORC file input format does not honor path filters set in the property "mapreduce.input.pathfilter.class" or "mapred.input.pathfilter.class", so custom filters cannot be used with ORC files.

      The AcidUtils class has a static filter called "hiddenFilters", which ORC uses to filter input paths. If we can pass the custom filter classes (set in the property mentioned above) to AcidUtils and replace hiddenFilters with a filter that performs an "and" operation over hiddenFilters and the custom filters, the filters would work as expected.
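
The "and" composition described above can be sketched as follows. This is a minimal, self-contained illustration and not the attached patch: the `PathFilter` interface here is a simplified stand-in for `org.apache.hadoop.fs.PathFilter` that takes a `String` instead of a Hadoop `Path`, and `HIDDEN_FILTER`, `and`, and the `region=eu` custom filter are hypothetical names chosen for the example. The hidden-file rule (reject names starting with `_` or `.`) is assumed from the usual Hadoop convention.

```java
import java.util.Arrays;
import java.util.List;

public class CombinedPathFilterSketch {

    /** Simplified stand-in for org.apache.hadoop.fs.PathFilter (assumption: String paths). */
    public interface PathFilter {
        boolean accept(String path);
    }

    /** Assumed hidden-file rule: reject file names that start with '_' or '.'. */
    public static final PathFilter HIDDEN_FILTER = path -> {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return !name.startsWith("_") && !name.startsWith(".");
    };

    /** Returns a filter that accepts a path only if every delegate accepts it. */
    public static PathFilter and(List<PathFilter> filters) {
        return path -> {
            for (PathFilter filter : filters) {
                if (!filter.accept(path)) {
                    return false; // one rejection is enough to drop the path
                }
            }
            return true;
        };
    }

    public static void main(String[] args) {
        // Hypothetical custom filter: keep only files under a region=eu directory.
        PathFilter custom = path -> path.contains("/region=eu/");
        PathFilter combined = and(Arrays.asList(HIDDEN_FILTER, custom));

        System.out.println(combined.accept("/warehouse/t/region=eu/bucket_00000")); // true
        System.out.println(combined.accept("/warehouse/t/region=eu/_tmp"));         // false
        System.out.println(combined.accept("/warehouse/t/region=us/bucket_00000")); // false
    }
}
```

Because the hidden-file filter stays in the chain, a custom filter can only narrow the set of input paths; it cannot re-admit files that ORC already skips.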

      In local testing, mapreduce.input.pathfilter.class works for text tables but not for ORC tables.

      Our analysis:

      OrcInputFormat and FileInputFormat are different implementations of the InputFormat interface. The property "mapreduce.input.pathfilter.class" is respected only by FileInputFormat, not by any other implementation of InputFormat. The customer wants the ability to filter out rows based on path or file name; current ORC features such as bloom filters and indexes are not sufficient for them to minimize the number of disk read operations.
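
For an input format to respect the property, it has to read the class name from the job configuration and instantiate the filter itself. The sketch below shows that lookup in isolation, under stated assumptions: `java.util.Properties` stands in for the Hadoop `Configuration`, `PathFilter` is a simplified stand-in for `org.apache.hadoop.fs.PathFilter`, and `OrcSuffixFilter` and `loadFilter` are hypothetical names. The property key is spelled as quoted in this issue.

```java
import java.util.Properties;

public class PathFilterFromConfigSketch {

    /** Simplified stand-in for org.apache.hadoop.fs.PathFilter (assumption: String paths). */
    public interface PathFilter {
        boolean accept(String path);
    }

    /** Hypothetical user-supplied filter: keeps only files ending in ".orc". */
    public static class OrcSuffixFilter implements PathFilter {
        @Override
        public boolean accept(String path) {
            return path.endsWith(".orc");
        }
    }

    /** Property name as quoted in this issue. */
    public static final String FILTER_KEY = "mapreduce.input.pathfilter.class";

    /** Returns the configured filter, or null when the property is unset. */
    public static PathFilter loadFilter(Properties conf) {
        String className = conf.getProperty(FILTER_KEY);
        if (className == null) {
            return null; // no custom filter configured
        }
        try {
            // Load the class by name and create it through its no-arg constructor.
            Class<? extends PathFilter> cls =
                Class.forName(className).asSubclass(PathFilter.class);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create path filter " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(FILTER_KEY, OrcSuffixFilter.class.getName());

        PathFilter filter = loadFilter(conf);
        System.out.println(filter.accept("/warehouse/t/part-0.orc")); // true
        System.out.println(filter.accept("/warehouse/t/part-0.txt")); // false
    }
}
```

An input format that never performs a lookup of this kind silently ignores the setting, which matches the behavior reported for ORC tables.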


        1. HIVE-22980.1.patch
          8 kB
          Oleksiy Sayankin
        2. HIVE-22980.2.patch
          8 kB
          Oleksiy Sayankin
        3. HIVE-22980.3.patch
          8 kB
          Oleksiy Sayankin




              • Assignee:
                osayankin Oleksiy Sayankin


                • Created: