Hive
  1. Hive
  2. HIVE-931

Optimize GROUP BY aggregations where key is a sorted/bucketed column

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: Query Processor
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If the table is sorted by a given key, we don't use that for group by. That can be very useful.

      For eg: if T is sorted by column c1,

      For select c1, aggr() from T group by c1
      we always use a single map-reduce job. No hash table is needed on the mapper, since the data is sorted by c1 anyway.

      This will reduce the memory pressure on the mapper and also remove overhead of maintaining the hash table.

      1. hive-931-2009-11-18.patch
        310 kB
        He Yongqiang
      2. hive-931-2009-11-19.patch
        326 kB
        He Yongqiang
      3. hive-931-2009-11-20.3.patch
        416 kB
        He Yongqiang
      4. hive-931-2009-11-21.patch
        416 kB
        He Yongqiang
      5. hive-931-2009-12-01.patch
        438 kB
        He Yongqiang
      6. hive-931-2009-12-03.patch
        449 kB
        He Yongqiang

        Activity

        Hide
        He Yongqiang added a comment -

        hive-931-2009-11-18.patch adds two tests, one in positive tests, the other in negative tests.

        Show
        He Yongqiang added a comment - hive-931-2009-11-18.patch adds two tests, one in positive tests, the other in negative tests.
        Hide
        Namit Jain added a comment -

        Discussed with Yongqiang offline - this should be a optimization step instead of the current approach.
        Since, at that time, partition pruning has also been performed

        Show
        Namit Jain added a comment - Discussed with Yongqiang offline - this should be a optimization step instead of the current approach. Since, at that time, partition pruning has also been performed
        Hide
        He Yongqiang added a comment -

        Incorporates Namit's comments. Thanks, Namit!

        Show
        He Yongqiang added a comment - Incorporates Namit's comments. Thanks, Namit!
        Hide
        Namit Jain added a comment -

        I think: bucketCols should be same as groupbyCols. not a superset.

        Consider:

        select ... group by a,b:

        where data is bucketed by a,b,c.

        A mapper might get:

        a1 b1 c1
        a1 b2 c2
        a1 b1 c3

        in which case the current algorithm might not work.

        Also, change the name of the variable in isTableSortedbyColumns to bucketedCols instead of sortedCols

        Show
        Namit Jain added a comment - I think: bucketCols should be same as groupbyCols. not a superset. Consider: select ... group by a,b: where data is bucketed by a,b,c. A mapper might get: a1 b1 c1 a1 b2 c2 a1 b1 c3 in which case the current algorithm might not work. Also, change the name of the variable in isTableSortedbyColumns to bucketedCols instead of sortedCols
        Hide
        Namit Jain added a comment -

        1. Given the fact that partition pruning has already happened and stored in the parse context, can you use that information
        instead of calling PartitionPruner.prune() again?
        2. Instead of walking up the tree, can you collect the list of the tablescans before that group by ?
        3. Can you add some more comments in GroupByOptimizer ?
        4. I am not sure, but there seems to be a bug there:

        what about the case:

        (subq) followed by groupby,

        are you taking the base tables of the subquery which may be different ?

        Can you add tests for the above scenario ?

        Show
        Namit Jain added a comment - 1. Given the fact that partition pruning has already happened and stored in the parse context, can you use that information instead of calling PartitionPruner.prune() again? 2. Instead of walking up the tree, can you collect the list of the tablescans before that group by ? 3. Can you add some more comments in GroupByOptimizer ? 4. I am not sure, but there seems to be a bug there: what about the case: (subq) followed by groupby, are you taking the base tables of the subquery which may be different ? Can you add tests for the above scenario ?
        Hide
        He Yongqiang added a comment -

        Thanks for the detailed comments, Namit!

        I think: bucketCols should be same as groupbyCols. not a superset.

        done. Changed all the occurrence of sortedGroupby to bucketGroupby to avoid confusion. (At first we think we need to do sorted groupby, but more accurate what we did is bucket groupby)

        Also, change the name of the variable in isTableSortedbyColumns to bucketedCols instead of sortedCols

        done.

        1. Given the fact that partition pruning has already happened and stored in the parse context, can you use that information

        instead of calling PartitionPruner.prune() again?
        done. Actually partition pruning does not perform the actual prunning job at optimize phase. hive-931-2009-11-20.3.patch added a field in ParseContext to reuse results of PartitionPrunner.

        Instead of walking up the tree, can you collect the list of the tablescans before that group by ?

        done.

        Also added a testcase for subquery

        Show
        He Yongqiang added a comment - Thanks for the detailed comments, Namit! I think: bucketCols should be same as groupbyCols. not a superset. done. Changed all the occurrence of sortedGroupby to bucketGroupby to avoid confusion. (At first we think we need to do sorted groupby, but more accurate what we did is bucket groupby) Also, change the name of the variable in isTableSortedbyColumns to bucketedCols instead of sortedCols done. 1. Given the fact that partition pruning has already happened and stored in the parse context, can you use that information instead of calling PartitionPruner.prune() again? done. Actually partition pruning does not perform the actual prunning job at optimize phase. hive-931-2009-11-20.3.patch added a field in ParseContext to reuse results of PartitionPrunner. Instead of walking up the tree, can you collect the list of the tablescans before that group by ? done. Also added a testcase for subquery
        Hide
        Namit Jain added a comment -

        1. Add new parameter in hive-default.xml
        2. Utilities.java: change function names - extractColumnNamesFromSortCols
        variable name: bucketCol: line 813
        3. Remove all the tabs
        4. Given the fact that we are not doing this optimization across sub-queries right now,
        would it be simpler to maintain the group by operator to table mapping via a separate walker
        instead of getting while generating the group by operator ? – I am fine with the current approach
        also, but just a question.
        5. You are still doing partition pruning in GroupByOptimizer - why cant we reuse the mapping from
        ParseContext. That was the whole reason for storing it in ParseContext.

        Sorry about being so picky...

        Show
        Namit Jain added a comment - 1. Add new parameter in hive-default.xml 2. Utilities.java: change function names - extractColumnNamesFromSortCols variable name: bucketCol: line 813 3. Remove all the tabs 4. Given the fact that we are not doing this optimization across sub-queries right now, would it be simpler to maintain the group by operator to table mapping via a separate walker instead of getting while generating the group by operator ? – I am fine with the current approach also, but just a question. 5. You are still doing partition pruning in GroupByOptimizer - why cant we reuse the mapping from ParseContext. That was the whole reason for storing it in ParseContext. Sorry about being so picky...
        Hide
        He Yongqiang added a comment -

        Updated the patch. Thanks Namit!

        5. You are still doing partition pruning in GroupByOptimizer - why cant we reuse the mapping from
        ParseContext. That was the whole reason for storing it in ParseContext.

        The mapping in ParseContext is null when reach GroupByOptimizer, this is because PartitionPruner does not obtain PartitionList at optimize phase, instead it only obtains partition predicates.

        Show
        He Yongqiang added a comment - Updated the patch. Thanks Namit! 5. You are still doing partition pruning in GroupByOptimizer - why cant we reuse the mapping from ParseContext. That was the whole reason for storing it in ParseContext. The mapping in ParseContext is null when reach GroupByOptimizer, this is because PartitionPruner does not obtain PartitionList at optimize phase, instead it only obtains partition predicates.
        Hide
        He Yongqiang added a comment -

        The mapping added in ParseContext will be updated by GroupByOptimizer, and reused when generating mr tasks.

        Show
        He Yongqiang added a comment - The mapping added in ParseContext will be updated by GroupByOptimizer, and reused when generating mr tasks.
        Hide
        He Yongqiang added a comment -

        updated the patch. And put sort columns into consideration.

        • We use bucket columns only when the sorted column set is empty or the
        • sorted column set is an exact prefix match of bucket columns. For example, A
        • table is bucketed by column a,b, and c, and a query wants to group by
        • a,b,c. If the table's sort column is null, or is [a],[a,b], or [a,b,c],
        • we can use the 'sorted groupby' by looking at the bucket columns .
        • If we can can not determine by looking at bucketed columns and the table
        • has sort columns, we resort to sort columns. We can use bucket group by
        • if the groupby column set is an exact prefix match of sort columns.
        Show
        He Yongqiang added a comment - updated the patch. And put sort columns into consideration. We use bucket columns only when the sorted column set is empty or the sorted column set is an exact prefix match of bucket columns. For example, A table is bucketed by column a,b, and c, and a query wants to group by a,b,c. If the table's sort column is null, or is [a] , [a,b] , or [a,b,c] , we can use the 'sorted groupby' by looking at the bucket columns . If we can can not determine by looking at bucketed columns and the table has sort columns, we resort to sort columns. We can use bucket group by if the groupby column set is an exact prefix match of sort columns.
        Hide
        Namit Jain added a comment -

        } else if (node instanceof exprNodeGenericFuncDesc) {
        158 exprNodeGenericFuncDesc udfNode = ((exprNodeGenericFuncDesc)node);
        159 GenericUDF udf = udfNode.getGenericUDF();
        160 if(!FunctionRegistry.isDeterministic(udf))
        161 return;
        162 groupByKeys.addAll(0, udfNode.getChildExprs());

        Isnt there a bug here ?

        group by foo(foo2)

        You cant assume that the first child is a column - you should recurse till you get a column

        Show
        Namit Jain added a comment - } else if (node instanceof exprNodeGenericFuncDesc) { 158 exprNodeGenericFuncDesc udfNode = ((exprNodeGenericFuncDesc)node); 159 GenericUDF udf = udfNode.getGenericUDF(); 160 if(!FunctionRegistry.isDeterministic(udf)) 161 return; 162 groupByKeys.addAll(0, udfNode.getChildExprs()); Isnt there a bug here ? group by foo(foo2 ) You cant assume that the first child is a column - you should recurse till you get a column
        Hide
        Namit Jain added a comment -

        Also can you add a test for the same.

        Show
        Namit Jain added a comment - Also can you add a test for the same.
        Hide
        He Yongqiang added a comment -

        Isnt there a bug here ?
        group by foo(foo2)
        You cant assume that the first child is a column - you should recurse till you get a column

        Thanks, Namit. I think i did not assume that the first child is a column, i just inserted all children exprs into the first place of groupByKeys list, and recurse.

        Will add a testcase for this.

        Show
        He Yongqiang added a comment - Isnt there a bug here ? group by foo(foo2 ) You cant assume that the first child is a column - you should recurse till you get a column Thanks, Namit. I think i did not assume that the first child is a column, i just inserted all children exprs into the first place of groupByKeys list, and recurse. Will add a testcase for this.
        Hide
        He Yongqiang added a comment -

        Attached a new patch. Had a lot of offline discussions with Namit. Thanks Namit!

        Finally, we changed to rule to,
        we will transform a group by to a sort based group by when

        1) If a table's sort columns are empty, and buckets columns contains and only contains all group by columns (order does not matter).

        or

        2) If a table's sort columns are not empty, group by columns are a prefix subset of sort columns.
        For example, if sorted by a,b,c, group by
        a,
        a,b
        b,a
        a,b,c
        b,a,c ..
        are all ok.

        Show
        He Yongqiang added a comment - Attached a new patch. Had a lot of offline discussions with Namit. Thanks Namit! Finally, we changed to rule to, we will transform a group by to a sort based group by when 1) If a table's sort columns are empty, and buckets columns contains and only contains all group by columns (order does not matter). or 2) If a table's sort columns are not empty, group by columns are a prefix subset of sort columns. For example, if sorted by a,b,c, group by a, a,b b,a a,b,c b,a,c .. are all ok.
        Hide
        Namit Jain added a comment -

        +1

        looks good - will commit if the tests pass after https://issues.apache.org/jira/browse/HIVE-549

        Show
        Namit Jain added a comment - +1 looks good - will commit if the tests pass after https://issues.apache.org/jira/browse/HIVE-549
        Hide
        Namit Jain added a comment -

        Committed. Thanks Yongqiang

        Show
        Namit Jain added a comment - Committed. Thanks Yongqiang
        Hide
        Zheng Shao added a comment -

        Is the "sorted by" property always the same as "bucketed by" for a table?

        The name "hive.optimize.groupby" is a bit too general but I guess it's OK for now. Can we explain what is "bucketed group by" in the hive-default.xml? Users probably won't understand what it is.

        Show
        Zheng Shao added a comment - Is the "sorted by" property always the same as "bucketed by" for a table? The name "hive.optimize.groupby" is a bit too general but I guess it's OK for now. Can we explain what is "bucketed group by" in the hive-default.xml? Users probably won't understand what it is.
        Hide
        He Yongqiang added a comment -

        Hi Zheng,

        Is the "sorted by" property always the same as "bucketed by" for a table?

        They are not the same. And usually 'sorted by' is empty.

        Can we explain what is "bucketed group by" in the hive-default.xml? Users probably won't understand what it is.

        Yes, i should add more explanation for this. I am ok to add it in a new jira, or update it on wiki, or both.

        Show
        He Yongqiang added a comment - Hi Zheng, Is the "sorted by" property always the same as "bucketed by" for a table? They are not the same. And usually 'sorted by' is empty. Can we explain what is "bucketed group by" in the hive-default.xml? Users probably won't understand what it is. Yes, i should add more explanation for this. I am ok to add it in a new jira, or update it on wiki, or both.

          People

          • Assignee:
            He Yongqiang
            Reporter:
            Namit Jain
          • Votes:
            1 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development