HIVE-14199: Enable Bucket Pruning for ACID tables

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.3.0
    • Component/s: Transactions
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Currently, ACID tables do not benefit from the bucket pruning feature introduced in HIVE-11525. Bucket pruning happens at split generation, and for ACID tables the delta files were traditionally never split. Parallelism for ACID was therefore limited to the number of buckets: there was one split per bucket, and the worker processing a split read all the delta files for that bucket, even when the query needed only one of the buckets to be read.
      However, HIVE-14035 now allows the delta files to be split as well. Split generation therefore has enough information to determine which buckets of the delta files need to be processed. This lets us prune unnecessary buckets for delta files and should yield a good performance gain for a large number of selective queries on ACID tables.
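
The idea of pruning delta files by bucket at split generation can be sketched roughly as follows. This is a minimal illustration, not the code in the attached patches: the class `BucketPruningSketch`, its methods, and the hashing scheme (non-negative hash of an integer key modulo the bucket count) are assumptions made for the example. It relies only on the fact that ACID base and delta directories store one file per bucket, named `bucket_N`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; class and method names are hypothetical, not Hive's code.
final class BucketPruningSketch {
    // ACID base and delta directories store one file per bucket, e.g. bucket_00003.
    private static final Pattern BUCKET_FILE = Pattern.compile("bucket_(\\d+)");

    // Assumed hashing scheme: non-negative hash of the key modulo the bucket count.
    static int bucketFor(int key, int numBuckets) {
        return (Integer.hashCode(key) & Integer.MAX_VALUE) % numBuckets;
    }

    // Returns the bucket id encoded in the file name, or -1 if the name does not match.
    static int bucketIdOf(String path) {
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        Matcher m = BUCKET_FILE.matcher(fileName);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }

    // Keeps only files that may contain rows for the key in an equality predicate.
    // Files whose bucket cannot be determined are kept, so pruning stays conservative.
    static List<String> prune(List<String> paths, int key, int numBuckets) {
        int wanted = bucketFor(key, numBuckets);
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            int id = bucketIdOf(path);
            if (id == -1 || id == wanted) {
                kept.add(path);
            }
        }
        return kept;
    }
}
```

For a table with 4 buckets and a predicate such as `key = 7`, `BucketPruningSketch.prune(files, 7, 4)` would keep only the `bucket_00003` files from the base and delta directories, so splits are generated for one bucket instead of all four.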

      Attachments

        1. HIVE-14199.01.patch
          11 kB
          Saket Saurabh
        2. HIVE-14199.02.patch
          10 kB
          Saket Saurabh
        3. HIVE-14199.03.patch
          10 kB
          Saket Saurabh

            People

              Assignee: Saket Saurabh (saketj)
              Reporter: Saket Saurabh (saketj)
              Votes: 0
              Watchers: 7
