Hive / HIVE-22513

Constant propagation of casted column in filter ops can cause incorrect results



    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0-alpha-1
    • Component/s: Query Planning
    • Labels: None


      This issue happens if CBO is disabled.

      We should not propagate a constant when the corresponding ExprNodeColumnDesc instance is wrapped inside a CAST operator, because casting might truncate information from the original column.

      This can happen when CAST is used in a WHERE clause, which causes the projected columns to be replaced in the SELECT operator. Their new value is the result of the cast, which can differ from the value in the original column:

      set hive.cbo.enable=false;
      set hive.fetch.task.conversion=more; --just for testing convenience
      create table testtb (id string);
      insert into testtb values('2019-11-05 01:01:11');
      select id, CAST(id AS VARCHAR(10)) from testtb where CAST(id AS VARCHAR(9)) = '2019-11-0';
      |     id     |    _c1     |
      | 2019-11-0  | 2019-11-0  |
      1 row selected (0.168 seconds)
      -- vs. expected: | 2019-11-05 01:01:11 | 2019-11-05 |
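The mismatch in the query above can be modelled outside Hive. The following is a minimal Java sketch, not Hive code: the class and method names are hypothetical, and the VARCHAR cast is assumed to simply keep the first N characters, as the query output suggests. It contrasts evaluating the projection on the original column with evaluating it after the constant from the filter has been substituted for the column.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model; none of these names come from Hive.
class CastTruncationDemo {

    // Models CAST(value AS VARCHAR(maxLength)): keeps at most maxLength characters.
    static String castToVarchar(String value, int maxLength) {
        return value.length() <= maxLength ? value : value.substring(0, maxLength);
    }

    // Intended evaluation: filter on the cast value, project the original column.
    // Returns null when the row does not pass the filter.
    static List<String> evaluateCorrectly(String id, String constant) {
        if (!castToVarchar(id, 9).equals(constant)) {
            return null;
        }
        return Arrays.asList(id, castToVarchar(id, 10));
    }

    // Faulty evaluation: treats CAST(id AS VARCHAR(9)) = constant as if it were
    // id = constant, and substitutes the constant for id in the projection.
    static List<String> evaluateWithFaultyPropagation(String id, String constant) {
        if (!castToVarchar(id, 9).equals(constant)) {
            return null;
        }
        String propagated = constant; // the original column value is lost here
        return Arrays.asList(propagated, castToVarchar(propagated, 10));
    }

    public static void main(String[] args) {
        String id = "2019-11-05 01:01:11";
        System.out.println(evaluateCorrectly(id, "2019-11-0"));
        System.out.println(evaluateWithFaultyPropagation(id, "2019-11-0"));
    }
}
```

The equality holds only for the truncated value, so it says nothing about the full contents of id; substituting the constant for the column reproduces the two truncated values seen in the query output.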

      It is hard to keep track of exactly which casts (which source and target types) cause information loss, and I don't think this should be taken into consideration when deciding whether or not to propagate a constant. Rather than adding a large, potentially convoluted and fragile check for this, I propose to prevent constant mappings from being created from CASTed columns.
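The proposed rule can be illustrated with a small expression-tree sketch in Java. This is an assumption-laden illustration, not the contents of the attached patch: the Column, Constant, Cast and Equals classes and the collectConstantMappings method are hypothetical stand-ins and do not mirror Hive's ExprNodeDesc hierarchy. The point is only that a column-to-constant mapping is recorded when the column appears bare in the equality, and never when it is wrapped in a cast, regardless of the types involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature expression model; not Hive's actual classes.
class CastAwarePropagationSketch {

    interface Expr {}

    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
    }

    static final class Constant implements Expr {
        final String value;
        Constant(String value) { this.value = value; }
    }

    static final class Cast implements Expr {
        final Expr child;
        final String targetType;
        Cast(Expr child, String targetType) {
            this.child = child;
            this.targetType = targetType;
        }
    }

    static final class Equals implements Expr {
        final Expr left;
        final Expr right;
        Equals(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }
    }

    // Collects column -> constant mappings implied by an equality predicate.
    static Map<String, String> collectConstantMappings(Equals predicate) {
        Map<String, String> mappings = new HashMap<>();
        recordIfBareColumn(predicate.left, predicate.right, mappings);
        recordIfBareColumn(predicate.right, predicate.left, mappings);
        return mappings;
    }

    // A Cast wrapping a Column is not an instance of Column, so no mapping is
    // recorded for it; no per-type analysis of the cast is attempted.
    private static void recordIfBareColumn(Expr side, Expr other, Map<String, String> mappings) {
        if (side instanceof Column && other instanceof Constant) {
            mappings.put(((Column) side).name, ((Constant) other).value);
        }
    }

    public static void main(String[] args) {
        Equals plain = new Equals(new Column("id"), new Constant("2019-11-0"));
        Equals casted = new Equals(
                new Cast(new Column("id"), "VARCHAR(9)"), new Constant("2019-11-0"));
        System.out.println(collectConstantMappings(plain).containsKey("id"));
        System.out.println(collectConstantMappings(casted).isEmpty());
    }
}
```

The trade-off in this sketch is the one described above: some lossless casts also lose the optimization, in exchange for not having to reason about which type pairs truncate.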


        1. HIVE-22513.1.patch
          13 kB
          Ádám Szita
        2. HIVE-22513.0.patch
          1 kB
          Ádám Szita



            szita Ádám Szita
            szita Ádám Szita