Hive / HIVE-3433

Implement CUBE and ROLLUP operators in Hive

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.10.0
    • Component/s: Query Processor
    • Labels: None
    • Attachments:
      1. hive-3433.6.patch (135 kB, Ivan Gorbachev)
      2. hive-3433.5.patch (136 kB, Ivan Gorbachev)
      3. hive-3433.4.patch (152 kB, Ivan Gorbachev)
      4. hive.3433.3.patch (104 kB, Namit Jain)
      5. hive.3433.2.patch (101 kB, Namit Jain)
      6. hive.3433.1.patch (98 kB, Namit Jain)

        Activity

        Sambavi Muthukrishnan added a comment -

        Provide an efficient implementation of CUBE and ROLLUP in Hive. We can use a group by followed by adding rows and re-grouping to provide the CUBE/ROLLUP.
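
For readers less familiar with the two operators: CUBE over n grouping columns aggregates over every subset of those columns, while ROLLUP aggregates over each prefix of the column list. The following is a minimal Python sketch of which grouping sets each operator implies. It is not Hive code, and the function names are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Return all 2**n subsets of the grouping columns, largest first."""
    n = len(columns)
    return [
        tuple(subset)
        for size in range(n, -1, -1)
        for subset in combinations(columns, size)
    ]


def rollup_grouping_sets(columns):
    """Return the n + 1 prefixes of the grouping columns, longest first."""
    return [tuple(columns[:size]) for size in range(len(columns), -1, -1)]


print(cube_grouping_sets(["a", "b"]))    # 4 grouping sets
print(rollup_grouping_sets(["a", "b"]))  # 3 grouping sets
```

The empty tuple stands for the grand total, that is, the aggregate over all rows with no grouping column kept.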

        Namit Jain added a comment -

        https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup

        Namit Jain added a comment -

        https://reviews.facebook.net/D5619

        Namit Jain added a comment -

        All the tests passed.

        The current implementation can be optimized further.
        In this patch, at aggregation time, the values for every grouping set are passed along,
        which increases the amount of data crossing the map-reduce boundary.
        It is still better than the current workaround for CUBE and ROLLUP in Hive, which is to run
        multiple GROUP BYs over the same base table.
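
To make the trade-off concrete, here is a rough Python sketch of expanding each input row once per grouping set before aggregation. It is not Hive code: `cube_sets`, `expand_row`, and the sample rows are hypothetical and only illustrate the idea. With G grouping sets, N input rows become N × G rows at the map-reduce boundary, which is the extra data mentioned above, but the base table is read once rather than once per GROUP BY.

```python
from itertools import combinations


def cube_sets(columns):
    """All subsets of the grouping columns, largest first."""
    return [
        tuple(subset)
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]


def expand_row(row, columns, grouping_sets):
    """Yield one key per grouping set; columns outside the set become None."""
    for grouping_set in grouping_sets:
        yield tuple(row[c] if c in grouping_set else None for c in columns)


# Hypothetical sample data with two grouping columns.
columns = ["a", "b"]
rows = [{"a": 1, "b": "x"}, {"a": 1, "b": "y"}, {"a": 2, "b": "x"}]

grouping_sets = cube_sets(columns)
shuffled = [key for row in rows for key in expand_row(row, columns, grouping_sets)]

print(len(rows), "input rows ->", len(shuffled), "rows sent to the reducers")
```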

        Namit Jain added a comment -

        Optimized the patch.
        This is ready for review.
        If a rollup/cube is being performed, hash aggregation is always used.
        The possible keys are computed, and the group by operator behaves as if
        it received multiple keys (one per grouping set) from the upstream operator.
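
The idea in this comment can be sketched as follows, under stated assumptions: each grouping set is represented as a bitmask over the key columns (bit i set means column i is kept), and the aggregate is a simple row count. This is an illustrative Python sketch, not the patch's actual `GroupByOperator` logic; `hash_aggregate_count` and the sample data are hypothetical.

```python
from collections import defaultdict


def hash_aggregate_count(rows, num_keys, masks):
    """Count rows per (grouping-set mask, key) in an in-memory hash table.

    Each input row is treated as if it had arrived once per grouping set:
    for every mask, the columns whose bit is clear are replaced by None.
    """
    table = defaultdict(int)
    for row in rows:
        for mask in masks:
            key = tuple(
                row[i] if mask & (1 << i) else None for i in range(num_keys)
            )
            # The mask is part of the hash key so that a rolled-up column
            # cannot be confused with a genuine NULL value in the data.
            table[(mask, key)] += 1
    return dict(table)


# Hypothetical data: ROLLUP over two key columns gives masks 0b11, 0b01, 0b00.
rows = [(1, "x"), (1, "y"), (2, "x")]
rollup_masks = [0b11, 0b01, 0b00]

result = hash_aggregate_count(rows, num_keys=2, masks=rollup_masks)
for (mask, key), count in sorted(result.items(), key=repr):
    print(f"mask={mask:02b} key={key} count={count}")
```

Because the table is aggregated before anything is emitted, only one partial result per distinct (mask, key) pair has to leave the operator, rather than one row per input row and grouping set.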

        Shreepadma Venugopalan added a comment -

        @Namit: I left comments on Phabricator. Thanks.

        Carl Steinbach added a comment -

        @Namit: I left some comments on phabricator. Thanks.

        Namit Jain added a comment -

        I will respond to the comments and upload a new patch.
        Thanks for taking a look.

        Namit Jain added a comment -

        addressed comments

        Namit Jain added a comment -

        Shreepadma, I saw your ivy.xml changes in https://reviews.apache.org/r/6878/diff/?page=1.
        I can do the conversion of bitset to fastbitset once your jira is in.

        Namit Jain added a comment -

        Shreepadma Venugopalan, thanks - using fastbitset instead.

        Shreepadma Venugopalan added a comment -

        Thanks for making the change, Namit.

        Namit Jain added a comment -

        The tests finished successfully

        Namit Jain added a comment -

        addressed all comments

        Namit Jain added a comment -

        Kevin Wilfong, addressed comments

        Kevin Wilfong added a comment -

        A few more comments, fairly small.

        Ivan Gorbachev added a comment -

        Fixed comments - https://reviews.facebook.net/D6141

        Kevin Wilfong added a comment -

        +1 Looks good.

        Kevin Wilfong added a comment -

        Committed, thanks Namit and Ivan.

        Hudson added a comment -

        Integrated in Hive-trunk-h0.21 #1758 (See https://builds.apache.org/job/Hive-trunk-h0.21/1758/)
        HIVE-3433. Implement CUBE and ROLLUP operators in Hive. (Ivan Gorbachev and Namit Jain via kevinwilfong) (Revision 1402245)

        Result = FAILURE
        kevinwilfong : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1402245
        Files :

        • /hive/trunk/eclipse-templates/.classpath
        • /hive/trunk/ivy/libraries.properties
        • /hive/trunk/ql/build.xml
        • /hive/trunk/ql/ivy.xml
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java
        • /hive/trunk/ql/src/test/queries/clientnegative/groupby_cube1.q
        • /hive/trunk/ql/src/test/queries/clientnegative/groupby_cube2.q
        • /hive/trunk/ql/src/test/queries/clientnegative/groupby_rollup1.q
        • /hive/trunk/ql/src/test/queries/clientnegative/groupby_rollup2.q
        • /hive/trunk/ql/src/test/queries/clientpositive/groupby_cube1.q
        • /hive/trunk/ql/src/test/queries/clientpositive/groupby_rollup1.q
        • /hive/trunk/ql/src/test/results/clientnegative/groupby_cube1.q.out
        • /hive/trunk/ql/src/test/results/clientnegative/groupby_cube2.q.out
        • /hive/trunk/ql/src/test/results/clientnegative/groupby_rollup1.q.out
        • /hive/trunk/ql/src/test/results/clientnegative/groupby_rollup2.q.out
        • /hive/trunk/ql/src/test/results/clientpositive/groupby_cube1.q.out
        • /hive/trunk/ql/src/test/results/clientpositive/groupby_rollup1.q.out
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby1.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby2.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby3.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby4.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby5.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby6.q.xml
        Hudson added a comment -

        Integrated in Hive-trunk-hadoop2 #54 (See https://builds.apache.org/job/Hive-trunk-hadoop2/54/)
        HIVE-3433. Implement CUBE and ROLLUP operators in Hive. (Ivan Gorbachev and Namit Jain via kevinwilfong) (Revision 1402245)

        Result = ABORTED
        kevinwilfong : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1402245
        Files :

        • /hive/trunk/eclipse-templates/.classpath
        • /hive/trunk/ivy/libraries.properties
        • /hive/trunk/ql/build.xml
        • /hive/trunk/ql/ivy.xml
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
        • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java
        • /hive/trunk/ql/src/test/queries/clientnegative/groupby_cube1.q
        • /hive/trunk/ql/src/test/queries/clientnegative/groupby_cube2.q
        • /hive/trunk/ql/src/test/queries/clientnegative/groupby_rollup1.q
        • /hive/trunk/ql/src/test/queries/clientnegative/groupby_rollup2.q
        • /hive/trunk/ql/src/test/queries/clientpositive/groupby_cube1.q
        • /hive/trunk/ql/src/test/queries/clientpositive/groupby_rollup1.q
        • /hive/trunk/ql/src/test/results/clientnegative/groupby_cube1.q.out
        • /hive/trunk/ql/src/test/results/clientnegative/groupby_cube2.q.out
        • /hive/trunk/ql/src/test/results/clientnegative/groupby_rollup1.q.out
        • /hive/trunk/ql/src/test/results/clientnegative/groupby_rollup2.q.out
        • /hive/trunk/ql/src/test/results/clientpositive/groupby_cube1.q.out
        • /hive/trunk/ql/src/test/results/clientpositive/groupby_rollup1.q.out
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby1.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby2.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby3.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby4.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby5.q.xml
        • /hive/trunk/ql/src/test/results/compiler/plan/groupby6.q.xml
        Ashutosh Chauhan added a comment -

        This issue is fixed and was released as part of the 0.10.0 release. If you find an issue that seems related to this one, please create a new JIRA and link it to this one.


          People

          • Assignee: Namit Jain
          • Reporter: Sambavi Muthukrishnan
          • Votes: 1
          • Watchers: 16
