Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-2332

Drill should be consistent with Implicit casting rules across data formats

    XMLWordPrintableJSON

Details

    Description

      Currently, the outcome of a query with a filter on a column comparing it with a literal, depends on the underlying data format.

      Parquet

      select * from date_dim where d_month_seq ='1193' limit 1;
      [Succeeds]
      
      select * from date_dim where d_date in ('1999-06-30') limit 1;
      [Succeeds]
      

      View on top of text:

      select * from date_dim where d_date in ('1999-06-30') limit 1;
      Query failed: SqlValidatorException: Values passed to IN operator must have compatible types
      
      Error: exception while executing query: Failure while executing query. (state=,code=0)
      
      select * from date_dim where d_month_seq ='1193' limit 1;
      Query failed: SqlValidatorException: Cannot apply '=' to arguments of type '<INTEGER> = <CHAR(4)>'. Supported form(s): '<COMPARABLE_TYPE> = <COMPARABLE_TYPE>'
      
      Error: exception while executing query: Failure while executing query. (state=,code=0)
      

      I understand that in the case of View on Text, SQL validation fails at the Optiq layer.

      But from the perspective of an end-user, Drill's behavior must be consistent across data formats. Also having a view by definition should abstract out this information.

      Here, both the view and parquet were created with type information.

      Parquet-meta

      parquet-schema /mapr/abhi311/data/parquet/tpcds/scale1/date_dim/0_0_0.parquet 
      message root {
        optional int32 d_date_sk;
        optional binary d_date_id (UTF8);
        optional binary d_date (UTF8);
        optional int32 d_month_seq;
        optional int32 d_week_seq;
        optional int32 d_quarter_seq;
        optional int32 d_year;
        optional int32 d_dow;
        optional int32 d_moy;
        optional int32 d_dom;
        optional int32 d_qoy;
        optional int32 d_fy_year;
        optional int32 d_fy_quarter_seq;
        optional int32 s_fy_week_seq;
        optional binary d_day_name (UTF8);
        optional binary d_quarter_name (UTF8);
        optional binary d_holiday (UTF8);
        optional binary d_weekend (UTF8);
        optional binary d_following_holiday (UTF8);
        optional int32 d_first_dom;
        optional int32 d_last_dom;
        optional int32 d_same_day_ly;
        optional int32 d_same_day_lq;
        optional binary d_current_day (UTF8);
        optional binary d_current_week (UTF8);
        optional binary d_current_month (UTF8);
        optional binary d_current_quarter (UTF8);
        optional binary d_current_year (UTF8);
      }
      

      Describe View

      > describe date_dim;
      +-------------+------------+-------------+
      | COLUMN_NAME | DATA_TYPE  | IS_NULLABLE |
      +-------------+------------+-------------+
      | d_date_sk   | INTEGER    | NO          |
      | d_date_id   | VARCHAR    | NO          |
      | d_date      | DATE       | NO          |
      | d_month_seq | INTEGER    | NO          |
      | d_week_seq  | INTEGER    | NO          |
      | d_quarter_seq | INTEGER    | NO          |
      | d_year      | INTEGER    | NO          |
      | d_dow       | INTEGER    | NO          |
      | d_moy       | INTEGER    | NO          |
      | d_dom       | INTEGER    | NO          |
      | d_qoy       | INTEGER    | NO          |
      | d_fy_year   | INTEGER    | NO          |
      | d_fy_quarter_seq | INTEGER    | NO          |
      | s_fy_week_seq | INTEGER    | NO          |
      | d_day_name  | VARCHAR    | NO          |
      | d_quarter_name | VARCHAR    | NO          |
      | d_holiday   | VARCHAR    | NO          |
      | d_weekend   | VARCHAR    | NO          |
      | d_following_holiday | VARCHAR    | NO          |
      | d_first_dom | INTEGER    | NO          |
      | d_last_dom  | INTEGER    | NO          |
      | d_same_day_ly | INTEGER    | NO          |
      | d_same_day_lq | INTEGER    | NO          |
      | d_current_day | VARCHAR    | NO          |
      | d_current_week | VARCHAR    | NO          |
      | d_current_month | VARCHAR    | NO          |
      | d_current_quarter | VARCHAR    | NO          |
      | d_current_year | VARCHAR    | NO          |
      +-------------+------------+-------------+
      28 rows selected (0.137 seconds)
      

      For an end

      Attachments

        Issue Links

          Activity

            People

              seanhychu Sean Hsuan-Yi Chu
              agirish Abhishek Girish
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: