  1. Hive
  2. HIVE-25861

When the ConstantPropagate optimizer optimizes a CASE WHEN = CASE WHEN comparison twice, it produces a wrong logical execution plan


Details

    Description

      Running the following SQL returns a wrong result:

      select
          t1.column_1,
          t2.column_1,
          t1.column_2,
          t1.column_3,
          case 
              when (
                  case 
                      when t1.column_1 in (310000, 320000, 330000, 340000) 
                      then 310000
                      else t1.column_1
                  end
              ) = (
                  case
                      when t2.column_1 in (310000, 320000, 330000, 340000) 
                      then 310000
                      else t2.column_1
                  end
              )
              then t1.column_2
              else t1.column_3
          end as result
      from
          dim.dim_xmf_center t1
          left join dim.dim_xmf_center t2
      where
          t1.mt = '202201';
      

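      In the affected row, t1.column_1 is 440000 and t2.column_1 is 440000. Neither value is in the IN list, so both inner CASE expressions evaluate to 440000, the comparison is true, and the expected result is t1.column_2. The actual result is t1.column_3.

      See the attached 1.png for the result.

      The CASE WHEN part of the execution plan is wrong: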

      CASE WHEN (CASE WHEN ((_col20) IN (310000, 320000, 330000, 340000)) THEN (CASE WHEN ((_col46) IN (310000, 320000, 330000, 340000)) THEN ((true = _col20)) ELSE (((_col46 = 310000) = _col20)) END) ELSE (CASE WHEN ((_col46) IN (310000, 320000, 330000, 340000)) THEN ((true = _col20)) ELSE (((_col46 = 310000) = _col20)) END) END) THEN (_col12) ELSE (_col15) END
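
The mismatch can be checked by hand with a small model of both expressions. The sketch below is only an illustration, not Hive code: it assumes that _col20 and _col46 correspond to t1.column_1 and t2.column_1, that _col12 and _col15 correspond to t1.column_2 and t1.column_3, and that comparing a boolean with an integer behaves like Python's ==, which may differ from Hive's implicit type conversion. The function names are hypothetical.

```python
# Values from the IN list in the reported query.
GROUP = {310000, 320000, 330000, 340000}


def normalize(code):
    """Inner CASE from the query: collapse any code in GROUP to 310000."""
    return 310000 if code in GROUP else code


def intended(t1_col1, t2_col1, col2, col3):
    """Outer CASE as written in the SQL."""
    return col2 if normalize(t1_col1) == normalize(t2_col1) else col3


def plan_as_printed(col20, col46, col12, col15):
    """Transcription of the CASE expression shown in the plan.

    Assumption: boolean-vs-integer equality is modeled with Python's ==,
    which may not match Hive's implicit type conversion.
    """
    if col20 in GROUP:
        cond = (True == col20) if col46 in GROUP else ((col46 == 310000) == col20)
    else:
        # The plan prints the same inner CASE in both branches.
        cond = (True == col20) if col46 in GROUP else ((col46 == 310000) == col20)
    return col12 if cond else col15


# The row from the report: both columns hold 440000.
print(intended(440000, 440000, "column_2", "column_3"))
print(plan_as_printed(440000, 440000, "column_2", "column_3"))
```

Under these assumptions the first call selects column_2 and the second selects column_3, which matches the reported behavior. The plan compares a boolean (true, or _col46 = 310000) against _col20 instead of comparing the two normalized values, and both outer branches carry the same inner CASE.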
      

      Attachments

        1. 1.png
          11 kB
          Jun Di
        2. 2.png
          84 kB
          Jun Di

            People

              Sirius Jun Di
              Sirius Jun Di

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 0.5h