HIVE-2036

Update bitmap indexes for automatic usage

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: Indexing
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support. Once the bitmap code is committed, it will need to be extended to enable automatic use of bitmap indexes. Most of the work will be in the BitmapIndexHandler, which needs to generate the re-entrant QL index query. There may also be significant work in the IndexPredicateAnalyzer to support predicates with ORs, instead of just ANDs as it currently does.

      1. HIVE-2036.1.patch
        159 kB
        Syed S. Albiz
      2. HIVE-2036.3.patch
        180 kB
        Syed S. Albiz
      3. HIVE-2036.8.patch
        424 kB
        Syed S. Albiz

          Activity

          Marquis Wang added a comment -

          Making notes on how to do this:

          What makes bitmap indexes different, and harder to use, is that they only become useful when multiple indexes are combined. You therefore need a query that joins the various bitmap index tables and returns the blocks that contain the rows we want.

          So the two parts to writing the automatic use index handler for bitmap indexes are:

          1. Figuring out what indexes to use:

          As mentioned above, you may need to extend the IndexPredicateAnalyzer to support ORs and possibly to return a tree of predicates (I don't think it already does this).

          2. Building a query that accesses the index tables:

          This is an example that I know works. For the original query

          SELECT * FROM lineitem WHERE L_QUANTITY = 50.0 AND L_DISCOUNT = 0.08 AND L_TAX = 0.01;

          the corresponding query against the index tables is:

          SELECT bucketname AS `_bucketname`, COLLECT_SET(offset) as `_offsets`
          FROM (SELECT
                  `_bucketname` AS bucketname, `_offset` AS offset
                FROM
                  (SELECT ab.`_bucketname`, ab.`_offset`, EWAH_BITMAP_AND(ab.bitmap, c.`_bitmaps`) as bitmap FROM
                    (SELECT a.`_bucketname`, b.`_offset`, EWAH_BITMAP_AND(a.`_bitmaps`, b.`_bitmaps`) as bitmap FROM 
                      (SELECT * FROM default__lineitem_quantity__ WHERE L_QUANTITY = 50.0) a JOIN 
                      (SELECT * FROM default__lineitem_discount__ WHERE L_DISCOUNT = 0.08) b 
                          ON a.`_bucketname` = b.`_bucketname` AND a.`_offset` = b.`_offset`) ab JOIN
                        (SELECT * FROM default__lineitem_tax__ WHERE L_TAX = 0.01) c
                          ON ab.`_bucketname` = c.`_bucketname` AND ab.`_offset` = c.`_offset`) abc 
                WHERE 
                  NOT EWAH_BITMAP_EMPTY(abc.bitmap)
          ) t
          GROUP BY bucketname;
          

          This format extends to any number of AND predicates: each additional index adds one more JOIN and EWAH_BITMAP_AND level. I'm pretty sure you can figure out how to expand it to include OR predicates and different groupings of predicates as well. If you change or extend the format, be sure to test it to make sure it has the performance characteristics you want.
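The nesting pattern in the query above can be generated mechanically. Here is a minimal sketch, assuming each index contributes one index table name and one predicate string; the class BitmapAndQuerySketch, its IndexScan holder, and the generated aliases (l1, r1, ...) are hypothetical names for illustration and are not part of Hive or of any patch on this issue.

```java
import java.util.List;

/** Illustrative only: hypothetical helper, not a Hive class. */
final class BitmapAndQuerySketch {

    /** One bitmap index table plus the predicate on its indexed column. */
    static final class IndexScan {
        final String indexTable;
        final String predicate;

        IndexScan(String indexTable, String predicate) {
            this.indexTable = indexTable;
            this.predicate = predicate;
        }
    }

    /** Folds the scans left to right, adding one JOIN + EWAH_BITMAP_AND level per extra index. */
    static String build(List<IndexScan> scans) {
        if (scans.isEmpty()) {
            throw new IllegalArgumentException("at least one index scan is required");
        }
        IndexScan first = scans.get(0);
        // Innermost level: a single index table filtered by its own predicate.
        String acc = "(SELECT `_bucketname`, `_offset`, `_bitmaps` AS bitmap FROM "
                + first.indexTable + " WHERE " + first.predicate + ")";
        for (int i = 1; i < scans.size(); i++) {
            IndexScan s = scans.get(i);
            String l = "l" + i; // alias for everything combined so far
            String r = "r" + i; // alias for the index table being added
            acc = "(SELECT " + l + ".`_bucketname`, " + l + ".`_offset`, "
                    + "EWAH_BITMAP_AND(" + l + ".bitmap, " + r + ".`_bitmaps`) AS bitmap FROM "
                    + acc + " " + l
                    + " JOIN (SELECT * FROM " + s.indexTable + " WHERE " + s.predicate + ") " + r
                    + " ON " + l + ".`_bucketname` = " + r + ".`_bucketname`"
                    + " AND " + l + ".`_offset` = " + r + ".`_offset`)";
        }
        // Outer level: drop blocks whose combined bitmap is empty, then collect offsets per file.
        return "SELECT bucketname AS `_bucketname`, COLLECT_SET(offset) AS `_offsets` FROM "
                + "(SELECT `_bucketname` AS bucketname, `_offset` AS offset FROM " + acc + " t "
                + "WHERE NOT EWAH_BITMAP_EMPTY(t.bitmap)) u GROUP BY bucketname";
    }
}
```

Calling build with the three lineitem index tables and their predicates yields a query with the same shape as the hand-written one, with two EWAH_BITMAP_AND levels. The sketch does no identifier quoting or escaping of the predicate text, which a real handler would need.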

          Russell Melick added a comment -

          To expand a bit on Marquis' comments.

          In CompactIndexHandler.getIndexPredicateAnalyzer(), we instantiate a predicate analyzer. My theory is that you're going to want a whole new PredicateAnalyzer class to deal with bitmaps, and then you'll instantiate it in a very similar way inside BitmapIndexHandler. You can also see here how we only search for columns on which we have indexes. This is going to need to be modified, since it currently only allows columns from a single index.

          You may also want to rewrite some of the logic in IndexWhereProcessor.process():110. It currently loops through every index available and asks it to do a rewrite. Perhaps it should loop through every index type and try to find the rewrites possible only using indexes of that type.
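One way to picture the "loop through every index type" idea is to bucket the available indexes by handler type first and then ask for one rewrite per bucket. A minimal sketch follows, assuming a simplified IndexInfo holder; the class names and the handlerType strings are hypothetical stand-ins, not the metastore's index objects or the actual IndexWhereProcessor code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical helper, not a Hive class. */
final class IndexGroupingSketch {

    /** Simplified stand-in for an index definition. */
    static final class IndexInfo {
        final String name;
        final String handlerType; // e.g. "BITMAP" or "COMPACT" (labels assumed for the sketch)

        IndexInfo(String name, String handlerType) {
            this.name = name;
            this.handlerType = handlerType;
        }
    }

    /** Groups indexes by handler type, preserving first-seen order of the types. */
    static Map<String, List<IndexInfo>> groupByHandlerType(List<IndexInfo> indexes) {
        Map<String, List<IndexInfo>> byType = new LinkedHashMap<>();
        for (IndexInfo idx : indexes) {
            byType.computeIfAbsent(idx.handlerType, k -> new ArrayList<>()).add(idx);
        }
        return byType;
    }
}
```

Each group could then be handed to a single handler of that type, so a bitmap rewrite sees all bitmap indexes on the table at once instead of one index at a time.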

          If you look at IndexPredicateAnalyzer:123, you can see where it makes sure that all the parent operators are AND operations. It should be easy to modify this to allow OR operations, but I'm not sure that simply allowing them and using the current system will maintain logical correctness. It's probably better to start off with just ANDs.

          The pushedPredicate is the important thing returned by the predicate analyzer. The pushed predicate is what it was able to recognize/process. That's the tree you'll want to use to generate the bitmap query. The residual predicate is what it couldn't process. There's a separate JIRA open (HIVE-2115) to use the residual to cut down on remaining work.
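The pushed/residual split under an AND-only rule can be sketched as follows. This is a toy model, not Hive's IndexPredicateAnalyzer: the Expr tree, the Split result, and all method names are assumptions made up for the example. A conjunct is pushed only if it is a leaf comparison on an indexed column reached through AND operators; everything else, including any OR subtree, stays in the residual.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

/** Illustrative only: hypothetical model of a pushed/residual predicate split. */
final class PredicateSplitSketch {

    /** Minimal expression node: AND/OR with children, or a LEAF comparison on one column. */
    static final class Expr {
        final String op;      // "AND", "OR", or "LEAF"
        final String column;  // set for leaves only
        final String text;    // rendered comparison, set for leaves only
        final List<Expr> children;

        private Expr(String op, String column, String text, List<Expr> children) {
            this.op = op;
            this.column = column;
            this.text = text;
            this.children = children;
        }

        static Expr leaf(String column, String text) {
            return new Expr("LEAF", column, text, new ArrayList<Expr>());
        }

        static Expr and(Expr... children) {
            return new Expr("AND", null, null, Arrays.asList(children));
        }

        static Expr or(Expr... children) {
            return new Expr("OR", null, null, Arrays.asList(children));
        }
    }

    /** Result of the analysis: conjuncts the index can handle, and the rest. */
    static final class Split {
        final List<String> pushed = new ArrayList<>();
        final List<String> residual = new ArrayList<>();
    }

    static Split analyze(Expr root, Set<String> indexedColumns) {
        Split out = new Split();
        collect(root, indexedColumns, out);
        return out;
    }

    private static void collect(Expr e, Set<String> indexedColumns, Split out) {
        if (e.op.equals("AND")) {
            // Only AND parents are walked through; each child is an independent conjunct.
            for (Expr child : e.children) {
                collect(child, indexedColumns, out);
            }
        } else if (e.op.equals("LEAF") && indexedColumns.contains(e.column)) {
            out.pushed.add(e.text);
        } else {
            // OR subtrees and comparisons on non-indexed columns are left for the normal filter.
            out.residual.add(render(e));
        }
    }

    private static String render(Expr e) {
        if (e.op.equals("LEAF")) {
            return e.text;
        }
        List<String> parts = new ArrayList<>();
        for (Expr child : e.children) {
            parts.add(render(child));
        }
        return "(" + String.join(" " + e.op + " ", parts) + ")";
    }
}
```

Because every pushed conjunct is ANDed with the residual in the original predicate, using only the pushed part to select blocks can over-select but never drops a matching row, which is why starting with AND alone is the safe choice.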

          The query generation lives in the IndexHandlers.generateIndexQuery(...). You'll definitely need more logic than the simple call to decomposedPredicate.pushedPredicate.getExprString() that is in the CompactIndexHandler.

          There are a few spots where hive.index.compact.file is used. These may need to be generalized. However, Marquis may have already taken care of this with the bitmap stuff. I don't remember what the new name for it was (I think it's hive.index.blockfilter.file), but it's probably easiest to look in one of his unit tests for it.

          The last thing I can think of is that having multiple index types on a single table, or queries that use multiple tables may become an issue. I created HIVE-2128 to deal with the multiple tables.

          Good luck!

          Marquis Wang added a comment -

          Russell is right. hive.index.compact.file is deprecated and replaced with hive.index.blockfilter.file (I think). I kept the former around for backwards-compatibility reasons, but we should try to avoid using it.
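A minimal sketch of what "prefer the new property, keep the old one for backwards compatibility" could look like, assuming the property names quoted in this thread are right (they are stated above with an "I think"). A plain Map stands in for the job configuration, and the class and method names are hypothetical.

```java
import java.util.Map;

/** Illustrative only: hypothetical lookup helper, not Hive code. */
final class BlockFilterConfSketch {
    // Property names as quoted in the comments above; treat them as unverified.
    static final String NEW_KEY = "hive.index.blockfilter.file";
    static final String OLD_KEY = "hive.index.compact.file";

    /** Returns the index result file, preferring the new key; null if neither is set. */
    static String resolveFilterFile(Map<String, String> conf) {
        String value = conf.get(NEW_KEY);
        if (value == null || value.isEmpty()) {
            value = conf.get(OLD_KEY); // deprecated fallback
        }
        return (value == null || value.isEmpty()) ? null : value;
    }
}
```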

          John Sichi added a comment -

          Yeah, starting off with just AND is probably already a good-sized chunk of work.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/
          -----------------------------------------------------------

          Review request for hive and John Sichi.

          Summary
          -------

          Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.

          This addresses bug HIVE-2036.
          https://issues.apache.org/jira/browse/HIVE-2036

          Diffs
          -----

          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/test/queries/clientpositive/index_bitmap3.q 508eb94
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing
          -------

          Passes unit tests. An additional test case, index_bitmap_auto.q, which tests automatic bitmap indexing, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          John Sichi added a comment -

          I've added some review board comments; I'll probably have some more once we've resolved these.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/#review773
          -----------------------------------------------------------

          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
          <https://reviews.apache.org/r/857/#comment1666>

          Update Javadoc and param name, including an explanation of what handler is supposed to do when multiple indexes are passed in.

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
          <https://reviews.apache.org/r/857/#comment1675>

          I'm confused by the logic here. You are throwing together all of the columns for all of the indexes, but we need to keep them segregated, don't we? Each subquery should only contain references to the columns relevant to the corresponding index.

          (But the partitioning predicates need to be applied to each index.)

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
          <https://reviews.apache.org/r/857/#comment1668>

          Why is this public instead of private?

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
          <https://reviews.apache.org/r/857/#comment1667>

          Use HiveUtils.unparseIdentifier

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java
          <https://reviews.apache.org/r/857/#comment1669>

          Why do we need this class at all? The superclass already uses hive.index.blockfilter.file by default.

          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
          <https://reviews.apache.org/r/857/#comment1672>

          Seems like we should only be looking at the indexes on the table accessed by this table scan. (This comment is retroactive to the original version of the file.)

          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
          <https://reviews.apache.org/r/857/#comment1673>

          Seems like the costing comment below applies to this too.

          ql/src/test/queries/clientpositive/index_bitmap3.q
          <https://reviews.apache.org/r/857/#comment1670>

          Why do we need this setting at all? (I'm not sure why it was there in the original version of the file.)

          - John

          On 2011-06-06 21:37:38, Syed Albiz wrote:

          (Updated 2011-06-06 21:37:38)

          Review request for hive and John Sichi.

          jiraposter@reviews.apache.org added a comment -

          On 2011-06-07 18:30:15, John Sichi wrote:
          > ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java, line 54
          > <https://reviews.apache.org/r/857/diff/1/?file=20596#file20596line54>
          >
          > Use HiveUtils.unparseIdentifier

          HiveUtils.unparseIdentifier is already applied to the argument that is passed in to the constructor.

          On 2011-06-07 18:30:15, John Sichi wrote:
          > ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/HiveBitmapIndexInputFormat.java, line 25
          > <https://reviews.apache.org/r/857/diff/1/?file=20599#file20599line25>
          >
          > Why do we need this class at all? The superclass already uses hive.index.blockfilter.file by default.

          Removed in the next diff.

          - Syed


          On 2011-06-08 00:22:37, Syed Albiz wrote:

          (Updated 2011-06-08 00:22:37)

          Review request for hive and John Sichi.

          Summary
          -------

          Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.

          This addresses bug HIVE-2036.
          https://issues.apache.org/jira/browse/HIVE-2036

          Diffs
          -----

          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing
          -------

          Passes unit tests. An additional test case, index_bitmap_auto.q, which tests automatic bitmap indexing, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/
          -----------------------------------------------------------

          (Updated 2011-06-08 00:22:37.292935)

          Review request for hive and John Sichi.

          Changes
          -------

          Addressed comments. Still does not propagate partition predicates to every single index sub-query, but it does ensure that predicates are only applied to indexes for which there are matching columns. After looking at the behavior of CompactIndexHandler on partitioned tables (and in testcase index_auto_partitioned.q) I can't quite see how the CompactIndexHandler identifies and propagates partitioning predicates correctly.

          Summary
          -------

          Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.

          This addresses bug HIVE-2036.
          https://issues.apache.org/jira/browse/HIVE-2036

          Diffs (updated)
          ---------------

          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing
          -------

          Passes unit tests. An additional testcase for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          John Sichi added a comment -

          I'll take a look at the new patch tomorrow. index_auto_partitioned.q does not actually include a predicate on the partitioning column, so it should be enhanced to do that.

          The way it works for the compact index handler is that if we have a predicate like

          WHERE part_col = 1 AND index_col = 2 AND some_other_col = 3
          

          then it should generate

          WHERE part_col = 1 AND index_col = 2
          

          in the internal query against the index table. That's the reason that getIndexPredicateAnalyzer walks through all the partitions and adds the predicate columns via allowColumnName. (The way it does it isn't so great since it repeats it for each partition, when in fact one partition should be good enough.)
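
As an illustration of the filtering described above, here is a minimal sketch. It is not Hive's actual IndexPredicateAnalyzer: the `Condition` and `PredicateSplit` classes are hypothetical names, and only AND-ed equality conjuncts are modeled. The set of allowed columns plays the role of the names registered via allowColumnName.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for one "column = value" conjunct of a WHERE clause.
final class Condition {
  final String column;
  final String value;

  Condition(String column, String value) {
    this.column = column;
    this.value = value;
  }

  @Override
  public String toString() {
    return column + " = " + value;
  }
}

// Splits an AND-ed predicate into the part usable against the index table
// and the residual part that must still be evaluated on the base table.
final class PredicateSplit {
  final List<Condition> pushed = new ArrayList<>();
  final List<Condition> residual = new ArrayList<>();

  static PredicateSplit analyze(List<Condition> conjuncts, Set<String> allowedColumns) {
    PredicateSplit split = new PredicateSplit();
    for (Condition c : conjuncts) {
      if (allowedColumns.contains(c.column)) {
        split.pushed.add(c);    // partition or index column: goes to the index query
      } else {
        split.residual.add(c);  // anything else stays behind
      }
    }
    return split;
  }

  // Renders conjuncts back into a WHERE clause; an empty list yields "".
  static String render(List<Condition> conjuncts) {
    if (conjuncts.isEmpty()) {
      return "";
    }
    StringBuilder sb = new StringBuilder("WHERE ");
    for (int i = 0; i < conjuncts.size(); i++) {
      if (i > 0) {
        sb.append(" AND ");
      }
      sb.append(conjuncts.get(i));
    }
    return sb.toString();
  }
}
```

With part_col and index_col allowed, the three-way predicate from the example renders its pushed part as `WHERE part_col = 1 AND index_col = 2`, and `some_other_col = 3` ends up in the residual list.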

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/#review782
          -----------------------------------------------------------

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java
          <https://reviews.apache.org/r/857/#comment1680>

          It's preferable to apply the unparsing right at the point of SQL rendering.

          - John
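
A rough sketch of what unparsing at the point of SQL rendering could look like. This is not the code in the patch: `InnerQuerySketch` and its `quote` helper are hypothetical, `quote` is only a stand-in for HiveUtils.unparseIdentifier, and the backtick quoting and the shape of the generated SELECT are assumptions made for illustration.

```java
// Hypothetical sketch: the object stores the raw identifier, and quoting
// happens only in the method that produces SQL text.
final class InnerQuerySketch {
  private final String tableName;  // raw, unquoted identifier
  private final String predicate;  // already-rendered predicate text

  InnerQuerySketch(String tableName, String predicate) {
    this.tableName = tableName;
    this.predicate = predicate;
  }

  // Stand-in for HiveUtils.unparseIdentifier; backtick quoting is assumed.
  static String quote(String identifier) {
    return "`" + identifier + "`";
  }

  // Callers that need the name for lookups get the raw form back.
  String getTableName() {
    return tableName;
  }

  // Quoting is applied here, at the point of SQL rendering.
  String toSql() {
    return "SELECT * FROM " + quote(tableName) + " WHERE " + predicate;
  }
}
```

Keeping the raw name in the field means other code can still compare or look up the table by name, and the quoting cannot be applied twice by accident.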

          John Sichi added a comment -

          Added a few new comments on review board.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/#review785
          -----------------------------------------------------------

          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
          <https://reviews.apache.org/r/857/#comment1681>

          I think that should be a period instead of a comma in "indexes, if"

          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
          <https://reviews.apache.org/r/857/#comment1682>

          How exactly are they combined? This Javadoc should be written as a contract between the optimizer and the index plugin author, so that the author knows exactly how to interpret the inputs and also what is going to be done with the output.

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
          <https://reviews.apache.org/r/857/#comment1683>

          Why do you need to use toArray here? indexCols.keySet is already a collection.

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
          <https://reviews.apache.org/r/857/#comment1684>

          Why are you converting the search conditions back into predicate form here? Wouldn't it be easier to analyze them as search conditions?

          - John

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/
          -----------------------------------------------------------

          (Updated 2011-06-10 06:35:32.125295)

          Review request for hive and John Sichi.

          Changes
          -------

          Based on a discussion with yongqian, I re-implemented the predicate decomposition in two steps: first computing the overall residual predicate from the union of all columns in the available indexes, and then computing the predicates to apply to each index individually. I have also extended the functionality to pass partition columns to allowColumnNames, and added/extended the testcases to check that partition predicates are propagated correctly. This required adding a check in IndexWhereProcessor.java that the correct FilterOperator was passed to the process(...) method (apparently a duplicate FilterOperator that does not have the entire predicate gets created).
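
The two-step decomposition can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the patch: `TwoStepDecomposition` is a hypothetical class, predicates are AND-only with at most one conjunct per column, and conjuncts are carried as plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a two-step decomposition of an AND-only predicate.
final class TwoStepDecomposition {
  // Conjuncts that no available index can serve.
  final List<String> residual = new ArrayList<>();
  // Index name -> conjuncts that can be evaluated against that index.
  final Map<String, List<String>> perIndex = new LinkedHashMap<>();

  // conjunctsByColumn: column name -> text of the conjunct on that column.
  // indexColumns: index name -> columns that index covers.
  static TwoStepDecomposition decompose(Map<String, String> conjunctsByColumn,
                                        Map<String, Set<String>> indexColumns) {
    TwoStepDecomposition result = new TwoStepDecomposition();

    // Step 1: residual relative to the union of all indexed columns.
    Set<String> allIndexed = new LinkedHashSet<>();
    for (Set<String> cols : indexColumns.values()) {
      allIndexed.addAll(cols);
    }
    for (Map.Entry<String, String> e : conjunctsByColumn.entrySet()) {
      if (!allIndexed.contains(e.getKey())) {
        result.residual.add(e.getValue());
      }
    }

    // Step 2: for each index, keep only the conjuncts on its own columns.
    for (Map.Entry<String, Set<String>> idx : indexColumns.entrySet()) {
      List<String> usable = new ArrayList<>();
      for (Map.Entry<String, String> e : conjunctsByColumn.entrySet()) {
        if (idx.getValue().contains(e.getKey())) {
          usable.add(e.getValue());
        }
      }
      if (!usable.isEmpty()) {
        result.perIndex.put(idx.getKey(), usable);
      }
    }
    return result;
  }
}
```

For a query on lineitem with conjuncts on l_quantity, l_discount and an unindexed column, and one single-column index each on l_quantity and l_discount, step 1 leaves only the unindexed conjunct in the residual, and step 2 assigns each of the other two conjuncts to its own index.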

          Diffs (updated)
          ---------------

          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
          ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Thanks,

          Syed

          Syed S. Albiz added a comment -

          This patch is still WIP; there are a couple of issues I know still need correcting. In particular, the index_auto_unused.q testcase fails: now that partition predicates propagate properly, it turns out there was no check to make sure that the index was built on the partition being queried (the testcase used to pass only because partition predicates weren't propagated anyway).

          I probably also want to refactor the logic in IndexWhereProcessor before this is ready.

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/
          -----------------------------------------------------------

          (Updated 2011-06-11 19:05:42.241706)

          Review request for hive and John Sichi.

          Changes
          -------

          Fix index_auto_unused.q testcase by adding a check for partitions in the index and ensuring that only partitions actually in the index are used to compute index predicates.
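
One possible reading of this check, as a minimal sketch: `IndexedPartitionFilter` and both of its methods are hypothetical, partitions are represented as plain name strings, and treating an empty intersection as "do not use the index" is an assumption rather than something stated in the patch.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: restrict the partitions considered for index
// predicates to those the index was actually built on.
final class IndexedPartitionFilter {
  // Returns the queried partitions that are also present in the index,
  // preserving the order of the queried set.
  static Set<String> usablePartitions(Set<String> queried, Set<String> indexed) {
    Set<String> usable = new LinkedHashSet<>(queried);
    usable.retainAll(indexed);
    return usable;
  }

  // Assumption: with no overlap, the index is simply not used.
  static boolean canUseIndex(Set<String> queried, Set<String> indexed) {
    return !usablePartitions(queried, indexed).isEmpty();
  }
}
```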

          Diffs (updated)
          ---------------

          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
          ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing
          -------

          Passes unit tests. An additional testcase for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/#review825
          -----------------------------------------------------------

          ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java
          <https://reviews.apache.org/r/857/#comment1790>

          I don't think this should be necessary. We just want to propagate the partition column predicate (whatever it is) from the base table query to the index table query; partition pruning on the index table query will do the rest of the work.

          In other words, if the original query had

          part_key=<whatever>

          we want to preserve that on the index table query. That's what the code is already supposed to be doing before your change; was it not working?
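The propagation described here can be sketched as follows, assuming an AND-only WHERE clause that has already been split into simple "column op literal" terms. The class and method names are hypothetical and this is not the IndexPredicateAnalyzer code; it only shows the partition predicate being carried over unchanged next to the indexed-column predicate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative only: not the IndexPredicateAnalyzer from Hive. */
final class PartitionPredicateSketch {

    /** One "column op literal" term of an AND-only WHERE clause. */
    static final class Conjunct {
        final String column;
        final String op;
        final String literal;

        Conjunct(String column, String op, String literal) {
            this.column = column;
            this.op = op;
            this.literal = literal;
        }

        String render() {
            return column + " " + op + " " + literal;
        }
    }

    /**
     * Keeps the terms the index table query can evaluate: terms on indexed
     * columns, plus terms on partition columns, which are passed through
     * verbatim so that partition pruning applies to the index table as well.
     */
    static String indexWhereClause(List<Conjunct> baseConjuncts,
                                   Set<String> indexedColumns,
                                   Set<String> partitionColumns) {
        List<String> kept = new ArrayList<>();
        for (Conjunct c : baseConjuncts) {
            if (indexedColumns.contains(c.column) || partitionColumns.contains(c.column)) {
                kept.add(c.render());
            }
        }
        return String.join(" AND ", kept);
    }
}
```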

          • John

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/#review826
          -----------------------------------------------------------

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
          <https://reviews.apache.org/r/857/#comment1792>

          Don't bother with empty return statements.

          • John

          jiraposter@reviews.apache.org added a comment -

          On 2011-06-13 22:57:46, John Sichi wrote:
          > ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java, line 114
          > <https://reviews.apache.org/r/857/diff/4/?file=20984#file20984line114>
          >
          > I don't think this should be necessary. We just want to propagate the partition column predicate (whatever it is) from the base table query to the index table query; partition pruning on the index table query will do the rest of the work.
          >
          > In other words, if the original query had
          >
          > part_key=<whatever>
          >
          > we want to preserve that on the index table query. That's what the code is already supposed to be doing before your change; was it not working?

          This code is to prevent automatic usage from kicking in if the index has not been built on the partition specified in the partition predicate. For example, if the index has only been built on partition ds=foo and the query is "select key from src where ds=bar;", we do not want to execute an index query. It seems like adding a test for bitmaps specifically to mirror index_auto_unused.q (which is where this functionality is tested for compact indexes) would be a good idea.
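The rule in the ds=foo / ds=bar example can be written down as a small sketch. The class and method names are hypothetical and partitions are represented as plain strings; it only captures the intended decision, not the code in the patch.

```java
import java.util.Set;

/** Illustrative only: names here are hypothetical, not Hive classes. */
final class IndexCoverageSketch {

    /**
     * The index may be used only if every partition the query reads
     * has already been built into the index.
     */
    static boolean canUseIndex(Set<String> queriedPartitions, Set<String> indexedPartitions) {
        return indexedPartitions.containsAll(queriedPartitions);
    }
}
```

With an index built only on ds=foo, a query restricted to ds=bar yields false, so the plain table scan is kept.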

          • Syed

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/#review825
          -----------------------------------------------------------

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/
          -----------------------------------------------------------

          (Updated 2011-06-14 04:05:43.158797)

          Review request for hive and John Sichi.

          Changes
          -------

          Removed the redundant check on the partition predicate (which is already done in IndexWhereProcessor). The check was causing problems because the query generated to build the index was itself run through the optimizer, and at that stage the optimizer thought that the index was already built and had the partition. A simpler solution is to just disable index query optimization while building indexes.
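One way to picture "disable index query optimization for building indexes" is a guard that turns the setting off for the duration of the build and restores it afterwards. This is a sketch under assumptions: a plain Map stands in for the session configuration, the class and method names are hypothetical, and it is assumed (not confirmed by this thread) that `hive.optimize.index.filter` is the relevant setting.

```java
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative only: a Map stands in for the real session configuration. */
final class IndexBuildGuardSketch {

    // Assumed to be the setting that controls automatic index usage.
    static final String AUTO_INDEX_KEY = "hive.optimize.index.filter";

    /** Runs buildStep with automatic index usage off, then restores the old value. */
    static <T> T withIndexOptimizationDisabled(Map<String, String> conf, Supplier<T> buildStep) {
        String previous = conf.get(AUTO_INDEX_KEY);
        conf.put(AUTO_INDEX_KEY, "false");
        try {
            return buildStep.get();
        } finally {
            // Restore exactly what was there before, including "unset".
            if (previous == null) {
                conf.remove(AUTO_INDEX_KEY);
            } else {
                conf.put(AUTO_INDEX_KEY, previous);
            }
        }
    }
}
```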

          Summary
          -------

          Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.

          This addresses bug HIVE-2036.
          https://issues.apache.org/jira/browse/HIVE-2036

          Diffs (updated)


          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
          ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing
          -------

          Passes unit tests. An additional testcase for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          jiraposter@reviews.apache.org added a comment -

          On 2011-06-13 22:57:46, John Sichi wrote:
          > ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java, line 114
          > <https://reviews.apache.org/r/857/diff/4/?file=20984#file20984line114>
          >
          > I don't think this should be necessary. We just want to propagate the partition column predicate (whatever it is) from the base table query to the index table query; partition pruning on the index table query will do the rest of the work.
          >
          > In other words, if the original query had
          >
          > part_key=<whatever>
          >
          > we want to preserve that on the index table query. That's what the code is already supposed to be doing before your change; was it not working?

          Syed Albiz wrote:

          This code prevents automatic usage from kicking in if the index has not been built on the partition specified in the partition predicate (i.e., if the index has only been built on partition ds=foo and the query is select key from src where ds=bar, we do not want to execute an index query). It seems like adding a test for bitmaps specifically, mirroring index_auto_unused.q (where this functionality is tested for compact indexes), would be a good idea.

          The logic for making sure that the necessary index partitions exist is already present in IndexWhereProcessor.checkPartitionsCoveredByIndex. If that's not working, we should fix it; it should not be necessary to change the predicate analyzer at all.

          • John
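The coverage check John refers to can be pictured with a minimal sketch. This is not the code in IndexWhereProcessor.checkPartitionsCoveredByIndex; the class name, method name, and the representation of partitions as plain strings are all assumptions made for illustration. The idea is only that the index is usable when every partition the query reads also has a built index partition.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a partition-coverage check; not Hive's actual code.
final class PartitionCoverageSketch {

    /**
     * Returns true only if every partition the query will read
     * also has a built index partition.
     */
    static boolean coveredByIndex(List<String> queriedPartitions,
                                  Set<String> builtIndexPartitions) {
        return builtIndexPartitions.containsAll(queriedPartitions);
    }

    public static void main(String[] args) {
        // The index has only been built on ds=foo, as in the example above.
        Set<String> built = new HashSet<>(Arrays.asList("ds=foo"));
        System.out.println(coveredByIndex(Arrays.asList("ds=foo"), built)); // true
        System.out.println(coveredByIndex(Arrays.asList("ds=bar"), built)); // false
    }
}
```

With a check of this shape living in one place, the predicate analyzer can stay unaware of which index partitions exist, which is the separation being asked for here.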

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/#review825
          -----------------------------------------------------------

          On 2011-06-14 04:05:43, Syed Albiz wrote:

          -----------------------------------------------------------

          This is an automatically generated e-mail. To reply, visit:

          https://reviews.apache.org/r/857/

          -----------------------------------------------------------

          (Updated 2011-06-14 04:05:43)

          Review request for hive and John Sichi.

          Summary

          -------

          Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.

          This addresses bug HIVE-2036.

          https://issues.apache.org/jira/browse/HIVE-2036

          Diffs

          -----

          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845

          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183

          ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION

          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609

          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b

          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d

          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a

          ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04

          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION

          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION

          ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a

          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION

          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing

          -------

          Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/#review836
          -----------------------------------------------------------

          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java
          <https://reviews.apache.org/r/857/#comment1806>

          Slight rephrasing suggested:

          "If multiple indexes are provided, it is up to the handler to decide whether to use none, one, some, or all of them. The supplied predicate may reference any of the columns from any of the indexes. If the handler decides to use more than one index, then it is responsible for generating tasks to combine their search results (e.g. via a JOIN)."

          ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java
          <https://reviews.apache.org/r/857/#comment1805>

          This should be gone.

          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java
          <https://reviews.apache.org/r/857/#comment1807>

          Delete commented-out code, or convert it into a TODO with a corresponding JIRA issue link.

          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
          <https://reviews.apache.org/r/857/#comment1808>

          Could you explain more about what's going on here?

          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
          <https://reviews.apache.org/r/857/#comment1817>

          Only do indexes.get(0) once.

          • John
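The handler contract described in the suggested javadoc can be sketched as follows. This is an illustration only: IndexInfo, chooseIndexes, and the one-column-per-index simplification are hypothetical and do not reflect the HiveIndexHandler signature in the patch. The handler receives several candidate indexes and keeps those whose column the predicate references, so the result may be none, one, some, or all of them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of "use none, one, some, or all" of the supplied indexes.
final class MultiIndexChoiceSketch {

    // Assumed minimal index description: a name plus the single column it covers.
    static final class IndexInfo {
        final String name;
        final String column;

        IndexInfo(String name, String column) {
            this.name = name;
            this.column = column;
        }
    }

    // Keep only the indexes whose column is referenced by the predicate.
    static List<String> chooseIndexes(List<IndexInfo> indexes, Set<String> predicateColumns) {
        List<String> chosen = new ArrayList<>();
        for (IndexInfo index : indexes) {
            if (predicateColumns.contains(index.column)) {
                chosen.add(index.name);
            }
        }
        return chosen;
    }
}
```

If more than one name comes back, the handler would then be responsible for combining the per-index results, for example by joining the index tables.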


          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/
          -----------------------------------------------------------

          (Updated 2011-06-14 21:26:21.276789)

          Review request for hive and John Sichi.

          Changes
          -------

          Addressed comments; added more comments explaining why we use indexes.get(0) in IndexWhereProcessor, as that seemed a bit unclear.

          Summary
          -------

          Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.
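To make the AND-only query generation concrete, here is a rough sketch of how one sub-select per bitmap index could be joined and the bitmaps intersected. It is not the BitmapInnerQuery/BitmapOuterQuery code from the patch: the class name, the aliases (bname, boffset, bmap, t0, t1, ...), and the index table names in the usage note are hypothetical. The index table columns (`_bucketname`, `_offset`, `_bitmaps`) and the EWAH_BITMAP_AND / EWAH_BITMAP_EMPTY functions are assumed to match Hive's bitmap index support, and the emitted SQL has not been run against Hive.

```java
import java.util.List;

// Hypothetical sketch of composing an index query over several bitmap index tables.
final class BitmapIndexQuerySketch {

    /**
     * Builds one sub-select per index table, joins them on bucket name and
     * offset, and keeps only rows whose AND-ed bitmap is non-empty.
     */
    static String build(List<String> indexTables, List<String> predicates) {
        if (indexTables.isEmpty() || indexTables.size() != predicates.size()) {
            throw new IllegalArgumentException("need one predicate per index table");
        }
        StringBuilder from = new StringBuilder();
        String bitmapExpr = null;
        for (int i = 0; i < indexTables.size(); i++) {
            String alias = "t" + i;
            String sub = "(SELECT `_bucketname` AS bname, `_offset` AS boffset, "
                + "`_bitmaps` AS bmap FROM " + indexTables.get(i)
                + " WHERE " + predicates.get(i) + ") " + alias;
            if (i == 0) {
                from.append(sub);
                bitmapExpr = alias + ".bmap";
            } else {
                from.append(" JOIN ").append(sub)
                    .append(" ON t0.bname = ").append(alias).append(".bname")
                    .append(" AND t0.boffset = ").append(alias).append(".boffset");
                // Nest the AND so that any number of indexes can be combined.
                bitmapExpr = "EWAH_BITMAP_AND(" + bitmapExpr + ", " + alias + ".bmap)";
            }
        }
        return "SELECT t0.bname, t0.boffset FROM " + from
            + " WHERE NOT EWAH_BITMAP_EMPTY(" + bitmapExpr + ")";
    }
}
```

Calling build with the tables idx_quantity and idx_discount and the predicates L_QUANTITY = 50.0 and L_DISCOUNT = 0.08 yields a two-way join filtered on EWAH_BITMAP_AND(t0.bmap, t1.bmap). Supporting OR would need more than swapping the bitmap function, since an inner join drops blocks that match only one side; that case is not shown.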

          This addresses bug HIVE-2036.
          https://issues.apache.org/jira/browse/HIVE-2036

          Diffs (updated)


          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
          ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing
          -------

          Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/
          -----------------------------------------------------------

          (Updated 2011-06-15 23:46:24.176586)

          Review request for hive and John Sichi.

          Changes
          -------

          Used setFilterExpr on the TableScanDesc to propagate the complete original predicate, because the PartitionConditionRemover was removing the partition predicate from the FilterOperator.
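A toy model of the problem described in this change, under loud assumptions: predicates are modeled as lists of conjunct strings, and stripPartitionConjuncts is a hypothetical stand-in for what the PartitionConditionRemover does to the filter. None of this is Hive's actual optimizer code. It only shows why a predicate read back from the filter no longer carries the partition condition, while a copy saved earlier (on the table scan descriptor, in the patch) still does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model only; the conjunct format "column = value" is an assumption.
final class PredicatePropagationSketch {

    // Drops conjuncts on partition columns, leaving the rest of the filter.
    static List<String> stripPartitionConjuncts(List<String> conjuncts,
                                                Set<String> partitionColumns) {
        List<String> kept = new ArrayList<>();
        for (String conjunct : conjuncts) {
            String column = conjunct.split("=")[0].trim();
            if (!partitionColumns.contains(column)) {
                kept.add(conjunct);
            }
        }
        return kept;
    }
}
```

For the conjuncts ds = 'foo' and key = 86 with partition column ds, the stripped filter keeps only key = 86. An index query built from that stripped filter would lose the ds condition, which is why the complete original predicate has to be carried separately.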

          Summary
          -------

          Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only supports AND predicates right now, but it should be possible to extend the BitmapQuery interface defined in this patch to easily support OR predicates as well. Currently benchmarking these changes on a test cluster.

          This addresses bug HIVE-2036.
          https://issues.apache.org/jira/browse/HIVE-2036

          Diffs (updated)


          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73
          ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b
          ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
          ql/src/test/results/clientpositive/index_auto.q.out 713bb40
          ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556
          ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc
          ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a
          ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5
          ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing
          -------

          Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          Hide
          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/#review856
          -----------------------------------------------------------

          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
          <https://reviews.apache.org/r/857/#comment1865>

          Need to update this comment now, explaining why we don't even look for the filter operator any more.

          • John

          On 2011-06-15 23:46:24, Syed Albiz wrote:

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/
          -----------------------------------------------------------

          (Updated 2011-06-15 23:46:24)

          Review request for hive and John Sichi.

          Summary
          -------

          Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.

          This addresses bug HIVE-2036.
          https://issues.apache.org/jira/browse/HIVE-2036

          Diffs
          -----

          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73
          ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b
          ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
          ql/src/test/results/clientpositive/index_auto.q.out 713bb40
          ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556
          ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc
          ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a
          ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5
          ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing
          -------

          Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          jiraposter@reviews.apache.org added a comment -

          -----------------------------------------------------------
          This is an automatically generated e-mail. To reply, visit:
          https://reviews.apache.org/r/857/
          -----------------------------------------------------------

          (Updated 2011-06-17 22:34:18.950303)

          Review request for hive and John Sichi.

          Changes
          -------

          Added comments. The filter expression is now pushed into the TableScan (TS) operator only when automatic indexing is turned on.

          Summary
          -------

          Add support for generating index queries to support automatic usage of bitmap indexes. This required changing the interface to the IndexHandlers to support accepting queries on multiple indexes. The compact indexes were modified to use this new interface as well, although no functional changes were made to how they work. Only AND predicates are supported right now, but it should be possible to extend the BitmapQuery interface defined in this patch to support OR predicates as well. Currently benchmarking these changes on a test cluster.
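
          To make the summary above more concrete, here is a rough sketch of how a tree of AND-combined bitmap index subqueries could be rendered into a single index query. It is not the code from the patch: the class names (BitmapQuerySketch, Query, Inner, And), the index table names, the column names (bucketname, offset, bitmaps), and the generated aliases are all hypothetical, and the use of EWAH_BITMAP_AND and EWAH_BITMAP_EMPTY as the bitmap UDF names is an assumption about the functions added with bitmap index support.

```java
/** Hypothetical sketch only; these are not the BitmapQuery classes from the patch. */
final class BitmapQuerySketch {

  /** A node in the index query tree; it renders itself as an aliased subquery. */
  interface Query {
    String alias();
    String sql();
  }

  /** Leaf node: scans one bitmap index table with the predicate on its indexed column. */
  static final class Inner implements Query {
    private final String indexTable;
    private final String predicate;
    private final String alias;

    Inner(String indexTable, String predicate, String alias) {
      this.indexTable = indexTable;
      this.predicate = predicate;
      this.alias = alias;
    }

    public String alias() { return alias; }

    public String sql() {
      return "(SELECT bucketname, offset, bitmaps FROM " + indexTable
          + " WHERE " + predicate + ") " + alias;
    }
  }

  /** Interior node: joins two subqueries on block location and ANDs their bitmaps. */
  static final class And implements Query {
    private final Query lhs;
    private final Query rhs;
    private final String alias;

    And(Query lhs, Query rhs, String alias) {
      this.lhs = lhs;
      this.rhs = rhs;
      this.alias = alias;
    }

    public String alias() { return alias; }

    public String sql() {
      String l = lhs.alias();
      String r = rhs.alias();
      return "(SELECT " + l + ".bucketname AS bucketname, " + l + ".offset AS offset, "
          + "EWAH_BITMAP_AND(" + l + ".bitmaps, " + r + ".bitmaps) AS bitmaps FROM "
          + lhs.sql() + " JOIN " + rhs.sql()
          + " ON " + l + ".bucketname = " + r + ".bucketname AND "
          + l + ".offset = " + r + ".offset) " + alias;
    }
  }

  /** Folds the leaves left to right into nested And nodes, then keeps non-empty bitmaps. */
  static String build(Query... leaves) {
    if (leaves.length == 0) {
      throw new IllegalArgumentException("at least one index subquery is required");
    }
    Query acc = leaves[0];
    for (int i = 1; i < leaves.length; i++) {
      acc = new And(acc, leaves[i], "t" + i); // generated aliases t1, t2, ...
    }
    String a = acc.alias();
    return "SELECT " + a + ".bucketname, " + a + ".offset FROM " + acc.sql()
        + " WHERE NOT EWAH_BITMAP_EMPTY(" + a + ".bitmaps)";
  }

  public static void main(String[] args) {
    System.out.println(build(
        new Inner("idx_quantity", "l_quantity = 50.0", "a"),
        new Inner("idx_discount", "l_discount = 0.08", "b"),
        new Inner("idx_tax", "l_tax = 0.01", "c")));
  }
}
```

          The leaf/interior split mirrors the inner/outer naming of the new query classes listed in the diff, but only loosely. An OR combinator is left out on purpose: it would presumably need a different join and bitmap function, and this sketch does not attempt it.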

          This addresses bug HIVE-2036.
          https://issues.apache.org/jira/browse/HIVE-2036

          Diffs (updated)
          ---------------

          ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 4fba845
          ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java e5ee183
          ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java 3caa4cc
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java af9d7b1
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapInnerQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapOuterQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapQuery.java PRE-CREATION
          ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 56e7609
          ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java d64e88b
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java 268560d
          ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java 0873e1a
          ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 95fef73
          ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java d22654b
          ql/src/test/queries/clientpositive/index_auto_partitioned.q 5f92f04
          ql/src/test/queries/clientpositive/index_bitmap_auto.q PRE-CREATION
          ql/src/test/queries/clientpositive/index_bitmap_auto_partitioned.q PRE-CREATION
          ql/src/test/results/clientpositive/index_auto.q.out 713bb40
          ql/src/test/results/clientpositive/index_auto_file_format.q.out 894a556
          ql/src/test/results/clientpositive/index_auto_multiple.q.out 27092dc
          ql/src/test/results/clientpositive/index_auto_partitioned.q.out 05cc84a
          ql/src/test/results/clientpositive/index_auto_unused.q.out 8a1eda5
          ql/src/test/results/clientpositive/index_bitmap3.q.out dadfa77
          ql/src/test/results/clientpositive/index_bitmap_auto.q.out PRE-CREATION
          ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out PRE-CREATION

          Diff: https://reviews.apache.org/r/857/diff

          Testing
          -------

          Passes unit tests. An additional test case for automatic bitmap indexing, index_bitmap_auto.q, was also added to the TestCliDriver suite. Currently benchmarking changes on a test cluster.

          Thanks,

          Syed

          John Sichi added a comment -

          +1. Will commit when tests pass.

          John Sichi added a comment -

          I mean, once the latest patch gets uploaded.

          John Sichi added a comment -

          Committed. Thanks Syed!

          Hudson added a comment -

          Integrated in Hive-trunk-h0.21 #790 (See https://builds.apache.org/job/Hive-trunk-h0.21/790/)


            People

            • Assignee:
              Syed S. Albiz
              Reporter:
              Russell Melick
            • Votes:
              0
              Watchers:
              3
