HIVE-10062

HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: None
    • Labels: None

    Description

      In the q.test environment, with the src table loaded, execute the following statements:

      CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE;
      
      CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE;
      
      FROM (SELECT 'tst1' AS key, CAST(COUNT(1) AS STRING) AS value FROM src s1
            UNION ALL
            SELECT s2.key AS key, s2.value AS value FROM src s2) unionsrc
      INSERT OVERWRITE TABLE DEST1
        SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
        GROUP BY unionsrc.key
      INSERT OVERWRITE TABLE DEST2
        SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
        GROUP BY unionsrc.key, unionsrc.value;

      SELECT * FROM DEST1;
      SELECT * FROM DEST2;
      

      DEST1 and DEST2 should each contain 310 rows. However, DEST2 contains only one row: "tst1 500 1".
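
      A quicker check than reading the full SELECT * output is to count the rows in each table. This is only a sketch, assuming the statements above have just been run in the same session; the expected values in the comments are the ones stated in this report, not the result of a fresh run:

      ```sql
      -- Row counts after the multi-insert; this report expects 310 for both tables.
      SELECT COUNT(*) FROM DEST1;
      -- With the bug described here, DEST2 is reported to hold a single row.
      SELECT COUNT(*) FROM DEST2;
      ```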

      Attachments

        1. HIVE-10062.01.patch (38 kB, Pengcheng Xiong)
        2. HIVE-10062.02.patch (126 kB, Pengcheng Xiong)
        3. HIVE-10062.03.patch (127 kB, Pengcheng Xiong)
        4. HIVE-10062.04.patch (127 kB, Pengcheng Xiong)
        5. HIVE-10062.05.patch (128 kB, Pengcheng Xiong)
        6. HIVE-10062.branch-1.patch (105 kB, Pengcheng Xiong)

            People

              Assignee: Pengcheng Xiong (pxiong)
              Reporter: Pengcheng Xiong (pxiong)
              Votes: 0
              Watchers: 3
