Hive / HIVE-2621

Allow multiple group bys with the same input data and spray keys to be run on the same reducer.

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

      Description

      Currently, when a user runs a query, such as a multi-insert, where each insertion subclause consists of a simple query followed by a group by, the group by for each subclause is run on a separate reducer. This requires writing the data for each group by clause to an intermediate file, and then reading it back. For an otherwise simple query, this intermediate I/O accounts for a significant share of the total CPU consumed by the query.

      If the subclauses are grouped by their distinct expressions and group by keys, with all of the group by expressions for a group of subclauses run on a single reducer, this would reduce the amount of reading/writing to intermediate files for some queries.

      To do this, for each group of subclauses, the mapper would execute the filters for each subclause OR'd together (provided each subclause has a filter), followed by a reduce sink. In the reducer, the child operators would be each subclause's filter, followed by the group by and any subsequent operations.
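
      The plan described above can be sketched as a plain Python simulation (not Hive code). The sample rows, the two subclause filters, the count aggregate, and every function name here are assumptions made for illustration; none of them come from the patch.

```python
from collections import defaultdict

# Hypothetical input rows shared by all subclauses.
ROWS = [
    {"key": "a", "value": 1},
    {"key": "a", "value": 12},
    {"key": "b", "value": 4},
    {"key": "b", "value": 15},
    {"key": "a", "value": 8},
    {"key": "c", "value": 20},
]

# Hypothetical subclauses: each has its own filter and groups by "key".
SUBCLAUSE_FILTERS = {
    "small": lambda row: row["value"] < 10,
    "even": lambda row: row["value"] % 2 == 0,
}


def map_phase(rows, filters):
    """Emit a row if ANY subclause filter accepts it (the filters OR'd together).

    No map-side aggregation happens here: raw rows go to the reduce sink.
    """
    for row in rows:
        if any(accepts(row) for accepts in filters.values()):
            yield row["key"], row


def shuffle(pairs):
    """Group emitted rows by the shared spray key."""
    groups = defaultdict(list)
    for key, row in pairs:
        groups[key].append(row)
    return groups


def reduce_phase(groups, filters):
    """Per key, re-apply each subclause's own filter, then aggregate (count)."""
    results = {name: {} for name in filters}
    for key, rows in groups.items():
        for name, accepts in filters.items():
            count = sum(1 for row in rows if accepts(row))
            if count:
                results[name][key] = count
    return results


def run_single_reducer_plan(rows, filters):
    """Run all subclauses' group bys over one shuffle of the shared input."""
    return reduce_phase(shuffle(map_phase(rows, filters)), filters)
```

      In this sketch the row with value 15 fails both filters, so the mapper drops it before the shuffle. Every other row is shuffled once and then filtered again per subclause in the reducer, which is why the input is read and written only once for both group bys.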

      Note that this would require turning off map aggregation, so we would need to make using this type of plan configurable.

      1. ASF.LICENSE.NOT.GRANTED--HIVE-2621.D567.1.patch
        163 kB
        Phabricator
      2. ASF.LICENSE.NOT.GRANTED--HIVE-2621.D567.2.patch
        162 kB
        Phabricator
      3. ASF.LICENSE.NOT.GRANTED--HIVE-2621.D567.3.patch
        151 kB
        Phabricator
      4. ASF.LICENSE.NOT.GRANTED--HIVE-2621.D567.4.patch
        457 kB
        Phabricator
      5. HIVE-2621.1.patch.txt
        163 kB
        Kevin Wilfong

          Activity

          Lefty Leverenz made changes -
            Link: This issue relates to HIVE-2056
          Ashutosh Chauhan made changes -
            Status: Resolved [ 5 ] -> Closed [ 6 ]
          Carl Steinbach made changes -
            Fix Version/s: 0.9.0 [ 12317742 ]
          He Yongqiang made changes -
            Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
            Resolution: Fixed [ 1 ]
          Phabricator made changes -
            Attachment: HIVE-2621.D567.4.patch [ 12508481 ]
          Phabricator made changes -
            Attachment: HIVE-2621.D567.3.patch [ 12508334 ]
          Phabricator made changes -
            Attachment: HIVE-2621.D567.2.patch [ 12507611 ]
          Kevin Wilfong made changes -
            Status: Open [ 1 ] -> Patch Available [ 10002 ]
          Phabricator made changes -
            Attachment: HIVE-2621.D567.1.patch [ 12505843 ]
          Kevin Wilfong made changes -
            Attachment: HIVE-2621.1.patch.txt [ 12505842 ]
          Kevin Wilfong created issue -

            People

            • Assignee: Kevin Wilfong
            • Reporter: Kevin Wilfong
            • Votes: 0
            • Watchers: 4
