  1. Hive
  2. HIVE-26673

Incorrect row count when vectorisation is enabled


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 4.0.0-alpha-2
    • Not Applicable
    • Hive
    • None

    Description

      Repro:

      SELECT count(*)
      FROM
        (SELECT T0.plant_no,
                T0.part_chain,
                T0.part_new,
                T0.part_no
         FROM dm_ads_dims_prod.cloudera_test3 T0
         LEFT JOIN
           (SELECT T0.plant_no,
                   T0.part_chain
            FROM
              (SELECT T0.plant_no,
                      T0.part_chain,
                      count(*) AS ct
               FROM dm_ads_dims_prod.cloudera_test3 T0
               WHERE purchase_pos = pos
               GROUP BY T0.plant_no,
                        T0.part_chain) T0
            WHERE ct = 2) T1 ON T0.plant_no = T1.plant_no
         AND T0.part_chain = T1.part_chain
         WHERE T0.purchase_pos = T0.pos
           AND (T1.part_chain IS NULL
                OR (T1.part_chain IS NOT NULL
                    AND T0.fd = 1))) s;
      

      Run the query a few times on the repro cluster with the following settings:

      set hive.query.results.cache.enabled=false;
      set hive.compute.query.using.stats=false;
      set hive.auto.convert.join=true;
      

      The results differed between runs:

      2682424
      2682426
      2682425

       

      Then turn off hive.auto.convert.join:

      set hive.query.results.cache.enabled=false;
      set hive.compute.query.using.stats=false;
      set hive.auto.convert.join=false;
      

      and the result was always 2682420

      Comparing the plans with hive.auto.convert.join enabled vs. disabled, the difference is the join type: map join vs. merge join.
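
      One way to make this plan comparison concrete is to print the plan together with its vectorization details. The sketch below assumes the same table and session settings as the repro above; the use of EXPLAIN VECTORIZATION DETAIL is a suggestion and not part of the original report. Run it once as written, then again with hive.auto.convert.join=false, and compare the join operators.

      ```sql
      -- Assumption: same cluster, table, and session settings as the repro.
      set hive.query.results.cache.enabled=false;
      set hive.compute.query.using.stats=false;
      set hive.auto.convert.join=true;  -- rerun with false to get the merge join plan

      -- Prints the plan plus, per operator, whether it was vectorized.
      EXPLAIN VECTORIZATION DETAIL
      SELECT count(*)
      FROM
        (SELECT T0.plant_no, T0.part_chain, T0.part_new, T0.part_no
         FROM dm_ads_dims_prod.cloudera_test3 T0
         LEFT JOIN
           (SELECT T0.plant_no, T0.part_chain
            FROM
              (SELECT T0.plant_no, T0.part_chain, count(*) AS ct
               FROM dm_ads_dims_prod.cloudera_test3 T0
               WHERE purchase_pos = pos
               GROUP BY T0.plant_no, T0.part_chain) T0
            WHERE ct = 2) T1
           ON T0.plant_no = T1.plant_no
          AND T0.part_chain = T1.part_chain
         WHERE T0.purchase_pos = T0.pos
           AND (T1.part_chain IS NULL
                OR (T1.part_chain IS NOT NULL AND T0.fd = 1))) s;
      ```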

      Vectorization also plays a role: with it turned off, the result was correct:

      SET hive.vectorized.execution.enabled=false;
      

      This is only a workaround and has a negative impact on performance, but it should help us narrow down the cause of the issue.
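
      A possible next narrowing step, sketched here as an assumption and not something the report tested: keep vectorization and map join conversion on, but disable only the native vectorized map join operators through hive.vectorized.execution.mapjoin.native.enabled, then rerun the repro query several times. Whether this changes the count is unknown.

      ```sql
      -- Assumption: same session as the repro; the outcome of this experiment
      -- is not stated in the report.
      set hive.query.results.cache.enabled=false;
      set hive.compute.query.using.stats=false;
      set hive.auto.convert.join=true;
      set hive.vectorized.execution.enabled=true;

      -- Fall back from the native vectorized map join operators
      -- while leaving the rest of the vectorized pipeline enabled.
      set hive.vectorized.execution.mapjoin.native.enabled=false;
      ```

      If the count is stable with this setting, that would point at the native vectorized map join path; if it still varies, the cause is more likely elsewhere in the vectorized plan.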

            People

              Unassigned Unassigned
              simhadri-g Simhadri Govindappa