Hive / HIVE-8498

Insert into table misses some rows when vectorization is enabled

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.13.1, 0.14.0
    • Fix Version/s: 0.14.0
    • Component/s: Vectorization

    Description

      The following is a small reproducible case for the issue:

      create table orc1
      stored as orc
      tblproperties("orc.compress"="ZLIB")
      as
      select rn
      from
      (
      select cast(1 as int) as rn from src limit 1
      union all
      select cast(100 as int) as rn from src limit 1
      union all
      select cast(10000 as int) as rn from src limit 1
      ) t;

      create table orc_rn1 (rn int);
      create table orc_rn2 (rn int);
      create table orc_rn3 (rn int);

      -- These inserts should produce 3 rows in total (one per table), but only 1 row is produced
      from orc1 a
      insert overwrite table orc_rn1 select a.* where a.rn < 100
      insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
      insert overwrite table orc_rn3 select a.* where a.rn >= 1000;

      select * from orc_rn1
      union all
      select * from orc_rn2
      union all
      select * from orc_rn3;

      The expected output of the query is:
      1
      100
      10000

      With vectorization enabled, however, only one row is returned:
      1
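
      One way to check that vectorization is the trigger is to rerun the multi-insert with the feature toggled and compare row counts. The sketch below assumes the standard session property hive.vectorized.execution.enabled and reuses the tables from the reproduction above; the count query is added for illustration and is not part of the original report.

      ```sql
      -- Assumption: vectorization is controlled per session by this property.
      set hive.vectorized.execution.enabled=true;

      from orc1 a
      insert overwrite table orc_rn1 select a.* where a.rn < 100
      insert overwrite table orc_rn2 select a.* where a.rn >= 100 and a.rn < 1000
      insert overwrite table orc_rn3 select a.* where a.rn >= 1000;

      -- Total rows across the three targets; the report implies 1 here instead of 3.
      select count(*) from (
        select rn from orc_rn1
        union all
        select rn from orc_rn2
        union all
        select rn from orc_rn3
      ) t;

      -- Disable vectorization, then rerun the multi-insert and the count above.
      -- If vectorization is the cause, the count should now be 3.
      set hive.vectorized.execution.enabled=false;
      ```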

      Attachments

        1. HIVE-8498.01.patch
          34 kB
          Matt McCline
        2. HIVE-8498.02.patch
          34 kB
          Matt McCline
        3. HIVE-8498.3.patch
          16 kB
          Jitendra Nath Pandey
        4. HIVE-8498.4.patch
          16 kB
          Jitendra Nath Pandey

            People

              Assignee: Jitendra Nath Pandey (jnp)
              Reporter: Prasanth Jayachandran (prasanth_j)
              Votes: 0
              Watchers: 8
