HIVE-222

Group by on a combination of distinct and non-distinct aggregates can return serialization errors with map-side aggregation.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.3.0
    • Component/s: Query Processor
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      HIVE-222. Fixed Group by on a combination of distinct and non-distinct aggregates. (Ashish Thusoo via zshao)

      Description

      For queries of the form (groupby2_map.q in the source)

      SELECT x, count(DISTINCT y), SUM FROM t GROUP BY x

      when map-side aggregation is enabled

      hive.map.aggr=true (this is off by default)

      the following exception can occur:
      [junit] Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Double
      [junit] at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeTypeDouble.serialize(DynamicSerDeTypeDouble.java:60)
      [junit] at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeFieldList.serialize(DynamicSerDeFieldList.java:235)
      [junit] at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDeStructBase.serialize(DynamicSerDeStructBase.java:81)
      [junit] at org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.serialize(DynamicSerDe.java:174)

      1. patch-222.txt
        5 kB
        Ashish Thusoo

          Activity

          Ashish Thusoo added a comment -

          Fix for the bug.

          There was a bug in the way the aggregation list was being generated for map-side aggregation. As a result, the ordering of the aggregations in the map-side groupby operator and the reduce-side groupby operator could differ, leading to this problem. Ideally, we should use the row schema information to generate the order, but that needs a much larger refactor of how we generate plans in the group-by case. For now this patch should fix the problem.

          There are preexisting tests that cover this (groupby2_map.q and groupby3_map.q). The test cases, however, rely on an internal hashmap returning the keys in a certain order. The bug was easily reproducible with the patch in HIVE-179 applied, and I have tested the fix with that patch.
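
To illustrate the failure mode described in this comment, here is a minimal sketch of how a positional mismatch between two aggregation lists can surface as a ClassCastException at serialization time. It is not Hive code: the class AggregationOrderSketch, the FieldSerializer interface, and the serializeRow helper are hypothetical stand-ins, and the only assumption carried over from the issue is that one aggregate yields a Long and the other a Double.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hive.
class AggregationOrderSketch {

    // Stand-in for a per-column serializer that expects one concrete type.
    interface FieldSerializer {
        String serialize(Object value);
    }

    // The casts mirror a serializer that trusts the declared column type.
    static final FieldSerializer LONG_FIELD = v -> "long:" + (Long) v;
    static final FieldSerializer DOUBLE_FIELD = v -> "double:" + (Double) v;

    // Serializes a row position by position against the declared schema.
    static String serializeRow(List<FieldSerializer> schema, List<Object> row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < schema.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(schema.get(i).serialize(row.get(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Schema as one side of the plan believes it: a count, then a sum.
        List<FieldSerializer> schema = Arrays.asList(LONG_FIELD, DOUBLE_FIELD);

        // Values emitted in the same order as the schema: serializes cleanly.
        List<Object> sameOrder = Arrays.<Object>asList(3L, 4.5);
        System.out.println(serializeRow(schema, sameOrder)); // prints long:3,double:4.5

        // Values emitted in the other order: a Double reaches the Long slot.
        List<Object> swapped = Arrays.<Object>asList(4.5, 3L);
        try {
            serializeRow(schema, swapped);
        } catch (ClassCastException e) {
            System.out.println("mismatched order: " + e.getMessage());
        }
    }
}
```

In this sketch the values themselves are fine; only their positions disagree with the schema, which is why the error appears in the serializer rather than in the aggregation logic.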

          Ashish Thusoo added a comment -

          submitting patch.

          Prasad Chakka added a comment -

          looks good +1

          though the patch doesn't guarantee that HashMap will return objects in the same order in both functions.
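
The ordering concern raised here can be sketched in isolation. java.util.HashMap makes no guarantee about iteration order, while java.util.LinkedHashMap iterates in insertion order, so two passes over the same LinkedHashMap see the keys in the same sequence. The sketch below is not the committed patch and does not claim to reflect how Hive stores its aggregations: the class OrderedAggregationsSketch, its methods, and the aggregation labels are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from the Hive source.
class OrderedAggregationsSketch {

    // Registers aggregation expressions in the order they are given.
    // LinkedHashMap preserves insertion order when iterated.
    static Map<String, Integer> register(String... expressions) {
        Map<String, Integer> aggregations = new LinkedHashMap<>();
        for (String expression : expressions) {
            // putIfAbsent keeps the first position of a repeated expression.
            aggregations.putIfAbsent(expression, aggregations.size());
        }
        return aggregations;
    }

    // Stand-in for one plan-building pass that walks the aggregations.
    static List<String> aggregationOrder(Map<String, Integer> aggregations) {
        return new ArrayList<>(aggregations.keySet());
    }

    public static void main(String[] args) {
        // The labels are made-up names for the aggregates in a query.
        Map<String, Integer> aggregations =
                register("count_distinct_y", "sum_col", "avg_col");

        // Two independent passes, standing in for the map-side and
        // reduce-side operator construction.
        List<String> mapSide = aggregationOrder(aggregations);
        List<String> reduceSide = aggregationOrder(aggregations);

        System.out.println(mapSide.equals(reduceSide));
        System.out.println(mapSide);
    }
}
```

With a plain HashMap the two passes would usually agree as well when the map is unchanged between them, but the order would depend on hashing rather than on the query, which is the fragility the existing tests are said to rely on.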

          Zheng Shao added a comment -

          Committed revision 734008. Thanks Ashish!

          David Phillips added a comment -

          Is there any value to adding the testcase from HIVE-215?


            People

            • Assignee:
              Ashish Thusoo
              Reporter:
              Ashish Thusoo
