Hive / HIVE-11320

ACID enable predicate pushdown for insert-only delta file

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.3.0
    • Component/s: Transactions
    • Labels: None

    Description

      Consider an ACID table T on which some Insert/Update/Delete operations have been executed but no Major Compaction has run since.

      This table will have some number of delta files (and possibly base files).

      Given a query: select * from T where c1 = 5;

      The OrcRawRecordMerger constructor currently disables predicate pushdown (PPD) in ORC for delta files via eventOptions.searchArgument(null, null);

      When a delta file is known to contain only Insert events, the predicate can safely be pushed down to it.

      ORC maintains stats in the file footer that include counts of insert/update/delete events in the file; these can be used to determine that a given delta file contains only Insert events.

      See OrcRecordUpdate.parseAcidStats()
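
The insert-only check described above can be sketched as follows. This is a minimal illustration, not the code in the patch: the class InsertOnlyDeltaCheck is a hypothetical stand-in for whatever OrcRecordUpdate.parseAcidStats() returns, and the "inserts,updates,deletes" string layout is an assumption made only for this example.

```java
import java.util.List;

/** Hypothetical stand-in for per-file ACID event counts; not Hive's actual class. */
final class InsertOnlyDeltaCheck {
    final long inserts;
    final long updates;
    final long deletes;

    InsertOnlyDeltaCheck(long inserts, long updates, long deletes) {
        this.inserts = inserts;
        this.updates = updates;
        this.deletes = deletes;
    }

    /** Parses "inserts,updates,deletes"; this serialized layout is an assumption. */
    static InsertOnlyDeltaCheck parse(String serialized) {
        String[] parts = serialized.trim().split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 counts, got: " + serialized);
        }
        return new InsertOnlyDeltaCheck(
                Long.parseLong(parts[0].trim()),
                Long.parseLong(parts[1].trim()),
                Long.parseLong(parts[2].trim()));
    }

    /** True when the file holds no update or delete events, so the predicate may be pushed. */
    boolean isInsertOnly() {
        return updates == 0 && deletes == 0;
    }

    public static void main(String[] args) {
        for (String stats : List.of("1000,0,0", "200,3,0", "0,0,7")) {
            InsertOnlyDeltaCheck counts = parse(stats);
            System.out.println(stats + " -> push predicate: " + counts.isInsertOnly());
        }
    }
}
```

In this sketch the reader would keep the search argument for a delta file only when isInsertOnly() returns true, and fall back to the current behavior (no pushdown) otherwise.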

      This will enable PPD for Streaming Ingest (HIVE-5687) use cases, which by definition generate only Insert events.

      PPD for deltas with arbitrary types of events can be achieved, but it is more complicated and will be addressed separately.
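
One way to see why deltas with Update events need more care is a toy model of the merge. The sketch below is not Hive's OrcRawRecordMerger; the Event type, the row ids, and the "last event per row wins" rule are simplifying assumptions for illustration. It compares filtering after the merge with naively filtering each file's events before the merge, using the predicate c1 = 5 from the query above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

/** Toy model of merging base rows with delta events; not Hive's OrcRawRecordMerger. */
final class NaivePushdownDemo {
    enum Op { INSERT, UPDATE, DELETE }

    /** One event: the operation, the row it targets, and the resulting value of c1. */
    static final class Event {
        final Op op;
        final long rowId;
        final int c1;

        Event(Op op, long rowId, int c1) {
            this.op = op;
            this.rowId = rowId;
            this.c1 = c1;
        }
    }

    /** Replays events in order; the last event per rowId decides the row's final state. */
    static Map<Long, Integer> merge(List<Event> events) {
        Map<Long, Integer> rows = new LinkedHashMap<>();
        for (Event e : events) {
            if (e.op == Op.DELETE) {
                rows.remove(e.rowId);
            } else {
                rows.put(e.rowId, e.c1);
            }
        }
        return rows;
    }

    /** Safe plan: merge every event first, then filter the merged rows. */
    static List<Long> filterAfterMerge(List<Event> base, List<Event> delta, IntPredicate p) {
        List<Event> all = new ArrayList<>(base);
        all.addAll(delta);
        List<Long> ids = new ArrayList<>();
        merge(all).forEach((id, c1) -> {
            if (p.test(c1)) {
                ids.add(id);
            }
        });
        return ids;
    }

    /** Naive plan: filter each file's events before merging (pushdown applied blindly). */
    static List<Long> filterBeforeMerge(List<Event> base, List<Event> delta, IntPredicate p) {
        List<Event> kept = new ArrayList<>();
        for (Event e : base) {
            if (p.test(e.c1)) kept.add(e);
        }
        for (Event e : delta) {
            if (p.test(e.c1)) kept.add(e);
        }
        return new ArrayList<>(merge(kept).keySet());
    }

    public static void main(String[] args) {
        List<Event> base = List.of(new Event(Op.INSERT, 1, 5), new Event(Op.INSERT, 2, 7));
        // The delta updates row 1 from c1 = 5 to c1 = 6 and inserts row 3 with c1 = 5.
        List<Event> delta = List.of(new Event(Op.UPDATE, 1, 6), new Event(Op.INSERT, 3, 5));
        IntPredicate c1Is5 = v -> v == 5;
        System.out.println("filter after merge:  " + filterAfterMerge(base, delta, c1Is5));
        System.out.println("filter before merge: " + filterBeforeMerge(base, delta, c1Is5));
    }
}
```

In this model the naive plan drops the Update event (its new value no longer matches the predicate), so the stale base version of row 1 is returned. When the delta contains only Insert events, both plans return the same rows, which is the case this issue targets.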

      Attachments

        1. HIVE-11320.patch (7 kB, Eugene Koifman)

            People

              Assignee: Eugene Koifman (ekoifman)
              Reporter: Eugene Koifman (ekoifman)