Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-7045

Wrong results in multi-table insert aggregating without group by clause

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Blocker
    • Resolution: Fixed
    • 0.10.0, 0.12.0
    • 0.14.0
    • None
    • None

    Description

      This happens whenever there are more than 1 reducers.

      The scenario :

      CREATE TABLE t1 (a int, b int);
      CREATE TABLE t2 (cnt int) PARTITIONED BY (var_name string);

      insert into table t1 select 1,1 from asd limit 1;
      insert into table t1 select 2,2 from asd limit 1;

      t1 contains :
      1 1
      2 2

      from t1
      insert overwrite table t2 partition(var_name='a') select count(a) cnt
      insert overwrite table t2 partition(var_name='b') select count(b) cnt ;

      select * from t2;
      returns :
      2 a
      2 b

      as expected.

      Setting the number of reducers higher than 1 :

      set mapred.reduce.tasks=2;

      from t1
      insert overwrite table t2 partition(var_name='a') select count(a) cnt
      insert overwrite table t2 partition(var_name='b') select count(b) cnt;

      select * from t2;
      1 a
      1 a
      1 b
      1 b

      Wrong results.

      This happens when ever t1 is big enough to automatically generate more than 1 reducers and without specifying it directly.

      adding "group by 1" in the end of each insert solves the problem :

      from t1
      insert overwrite table t2 partition(var_name='a') select count(a) cnt group by 1
      insert overwrite table t2 partition(var_name='b') select count(b) cnt group by 1;

      generates :
      2 a
      2 b

      This should work without the group by...
      The number of rows for each partition will be the amount of reducers.
      Each reducer calculated a sub total of the count.

      Attachments

        1. HIVE-7045.1.patch.txt
          1 kB
          Navis Ryu

        Issue Links

          Activity

            People

              navis Navis Ryu
              dimamah dima machlin
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: