  1. Hive
  2. HIVE-5831

filter input files for bucketed tables

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None

    Description

      When a query on a bucketed table filters on the bucketing column, only the buckets that can contain matching rows need to be scanned, which reduces the amount of input read and improves performance.
      Given a table test:
      CREATE TABLE test (x INT, y STRING) CLUSTERED BY ( x ) INTO 10 BUCKETS;
      The following query only has to scan bucket 5:
      SELECT * FROM test WHERE x=5;
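
The pruning idea above can be sketched in Java as follows. This is a minimal illustration, not the code in the attached patch: the class and method names are hypothetical, the bucket assignment is assumed to be a non-negative hash of the column value modulo the bucket count (with an INT hashing to itself), and the file names are assumed to be listed in bucket order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BucketPruningSketch {

    // Assumed bucket assignment: clear the sign bit of the hash, then take
    // the remainder by the bucket count. For an INT the hash is the value.
    public static int bucketFor(int value, int numBuckets) {
        int hash = Integer.hashCode(value);
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    // For an equality predicate on the bucketing column, keep only the file
    // of the matching bucket. Assumes one file per bucket, in bucket order.
    public static List<String> pruneFiles(List<String> bucketFiles, int value) {
        int bucket = bucketFor(value, bucketFiles.size());
        return Collections.singletonList(bucketFiles.get(bucket));
    }

    public static void main(String[] args) {
        // Illustrative file names for the 10 buckets of table test.
        List<String> files = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            files.add(String.format("%06d_0", i));
        }
        // WHERE x=5 leaves a single input file to scan.
        System.out.println(pruneFiles(files, 5));
    }
}
```

Under these assumptions the query with x=5 reads one file out of ten. Predicates that are not simple equalities on the bucketing column (ranges, expressions on x) would need to fall back to scanning all buckets.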

      Attachments

        1. hive-5831.patch (35 kB, Rui Li)

            People

              Assignee: Unassigned
              Reporter: Rui Li (lirui)
              Votes: 0
              Watchers: 4
