Details

    • Hadoop Flags:
      Reviewed

      Description

      For partition predicates that cannot be pushed down to JDO filtering (HIVE-2049), we should fall back to the old approach of listing all partition names first and use Hive's expression evaluation engine to select the correct partitions. Then the partition pruner should hand Hive a list of partition names and return a list of Partition Object (this should be added to the Hive API).

      A possible optimization is that the the partition pruner should give Hive a set of ranges of partition names (say [ts=01, ts=11], [ts=20, ts=24]), and the JDO query should be formulated as range queries. Range queries are possible because the first step list all partition names in sorted order. It's easy to come up with a range and it is guaranteed that the JDO range query results should be equivalent to the query with a list of partition names.

      1. HIVE-2050.patch
        161 kB
        Ning Zhang
      2. HIVE-2050.4.patch
        166 kB
        Ning Zhang
      3. HIVE-2050.3.patch
        166 kB
        Ning Zhang
      4. HIVE-2050.2.patch
        170 kB
        Ning Zhang

        Activity

        Hide
        Ning Zhang added a comment -

        Uploading a new patch for review. Still running tests. The review board request: https://reviews.apache.org/r/522/

        Show
        Ning Zhang added a comment - Uploading a new patch for review. Still running tests. The review board request: https://reviews.apache.org/r/522/
        Hide
        Ning Zhang added a comment -

        passed all unit tests.

        Show
        Ning Zhang added a comment - passed all unit tests.
        Hide
        Ning Zhang added a comment -

        Note that this patch implements a simple API that passes a list of partition names rather than a range of partition names. My performance testing indicates that bottleneck is not in the JDO query itself. The JDO queries that getting the list of all MPartitions takes about 5 secs for a list of 20k partitions. However converting these 20k MPartitions to Partitions took about 3 mins. Committing the transaction took another 3 mins.

        Note that converting MPartitions to Partitions and committing transactions are common operations. Even though we use JDO pushdown (HIVE-2048) or use range queries, these costs are still there. We need to optimize these costs away in the next step.

        Show
        Ning Zhang added a comment - Note that this patch implements a simple API that passes a list of partition names rather than a range of partition names. My performance testing indicates that bottleneck is not in the JDO query itself. The JDO queries that getting the list of all MPartitions takes about 5 secs for a list of 20k partitions. However converting these 20k MPartitions to Partitions took about 3 mins. Committing the transaction took another 3 mins. Note that converting MPartitions to Partitions and committing transactions are common operations. Even though we use JDO pushdown ( HIVE-2048 ) or use range queries, these costs are still there. We need to optimize these costs away in the next step.
        Hide
        Namit Jain added a comment -

        Based on an offline review, this may increase memory, we need to return the
        partition names periodically to put a memory bound

        Show
        Namit Jain added a comment - Based on an offline review, this may increase memory, we need to return the partition names periodically to put a memory bound
        Hide
        Ning Zhang added a comment -

        There are 2 major changes from the last patch:

        • added a parameter hive.metastore.batch.retrieve.max to control the maximum number of partitions can be retrieved from the metastore in one batch (default 300). In Hive.getPartitionsByNames(), the input partition name list are separated into sublists and call the metastore API for each sublist.
        • one of the most time consuming DB operations is the retrieve the sub-classes of MPartition. In particular the list of FieldSchema are retrieved for each partition and they are never used (the table's field schema is used for all partitions). So one of the changes here is to omit the retrieval of FieldSchema and make the table's fieldschema as the partitions. If later we need the partition's fieldschema for schema evaluation, we should add another function/flag for that.

        These changes reduce memory by 50% and CPU by 20%.

        The review board is also updated with the Java-only patch.

        Show
        Ning Zhang added a comment - There are 2 major changes from the last patch: added a parameter hive.metastore.batch.retrieve.max to control the maximum number of partitions can be retrieved from the metastore in one batch (default 300). In Hive.getPartitionsByNames(), the input partition name list are separated into sublists and call the metastore API for each sublist. one of the most time consuming DB operations is the retrieve the sub-classes of MPartition. In particular the list of FieldSchema are retrieved for each partition and they are never used (the table's field schema is used for all partitions). So one of the changes here is to omit the retrieval of FieldSchema and make the table's fieldschema as the partitions. If later we need the partition's fieldschema for schema evaluation, we should add another function/flag for that. These changes reduce memory by 50% and CPU by 20%. The review board is also updated with the Java-only patch.
        Hide
        Namit Jain added a comment -

        Comments posted on the review board

        Show
        Namit Jain added a comment - Comments posted on the review board
        Hide
        Ning Zhang added a comment -

        Taken Namit's comment. Review board is also updated.

        Show
        Ning Zhang added a comment - Taken Namit's comment. Review board is also updated.
        Hide
        Namit Jain added a comment -

        +1

        Show
        Namit Jain added a comment - +1
        Hide
        Namit Jain added a comment -

        The test pcr.q is failing - can you take a look ?
        The results look wrong.

        Show
        Namit Jain added a comment - The test pcr.q is failing - can you take a look ? The results look wrong.
        Hide
        Ning Zhang added a comment -

        resolved pcr.q (only PartitionPruner.java is changed). I'm also running tests now.

        Show
        Ning Zhang added a comment - resolved pcr.q (only PartitionPruner.java is changed). I'm also running tests now.
        Hide
        Namit Jain added a comment -

        can you update review board also ?

        Show
        Namit Jain added a comment - can you update review board also ?
        Hide
        Ning Zhang added a comment -

        updated the review board. Also my tests passed.

        Show
        Ning Zhang added a comment - updated the review board. Also my tests passed.
        Hide
        Namit Jain added a comment -

        +1

        Show
        Namit Jain added a comment - +1
        Hide
        Namit Jain added a comment -

        Committed. Thanks Ning

        Show
        Namit Jain added a comment - Committed. Thanks Ning

          People

          • Assignee:
            Ning Zhang
            Reporter:
            Ning Zhang
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development