  1. Hive
  2. HIVE-13039

BETWEEN predicate is not functioning correctly with predicate pushdown on Parquet table

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.1, 2.0.0
    • Fix Version/s: 1.3.0, 2.1.0
    • Component/s: Physical Optimizer
    • Labels: None

    Description

      BETWEEN becomes exclusive on a Parquet table when predicate pushdown is on (as it is by default in newer Hive versions). To reproduce (in a cluster, not a local setup):
      CREATE TABLE parquet_tbl(
      key int,
      ldate string)
      PARTITIONED BY (
      lyear string )
      ROW FORMAT SERDE
      'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
      STORED AS INPUTFORMAT
      'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
      OUTPUTFORMAT
      'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat';

      insert overwrite table parquet_tbl partition (lyear='2016')
      select 1, '2016-02-03' from src limit 1;

      set hive.optimize.ppd.storage = true;
      set hive.optimize.ppd = true;
      select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';

      In a cluster, this query returns no rows.
      If hive.optimize.ppd is turned off, it returns the expected single row.
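
      For comparison, standard SQL defines BETWEEN as inclusive at both ends, so the failing query should behave like an explicit pair of range comparisons. The sketch below reuses the parquet_tbl table from the reproduction and shows that equivalent form, followed by the session-level check of disabling hive.optimize.ppd. Whether the explicit comparison form is also affected by the pushdown bug was not verified in this report, so treat it as an illustration of the expected semantics rather than a confirmed workaround.

```sql
-- Expected semantics: BETWEEN is inclusive, so this is the same predicate
-- written with explicit comparisons (assumption: not confirmed to avoid the bug).
select * from parquet_tbl
where ldate >= '2016-02-03' and ldate <= '2016-02-03';

-- Session-level check from the description: disable predicate pushdown,
-- then re-run the original query; one row is expected.
set hive.optimize.ppd = false;
select * from parquet_tbl where ldate between '2016-02-03' and '2016-02-03';
```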

      Attachments

        1. HIVE-13039.1.branch1.txt
          5 kB
          Yongzhi Chen
        2. HIVE-13039.1.patch
          5 kB
          Yongzhi Chen
        3. HIVE-13039.2.branch-1.txt
          6 kB
          Yongzhi Chen
        4. HIVE-13039.2.patch
          7 kB
          Yongzhi Chen
        5. HIVE-13039.3.patch
          9 kB
          Yongzhi Chen

            People

              Assignee: Yongzhi Chen (ychena)
              Reporter: Yongzhi Chen (ychena)
              Votes: 0
              Watchers: 2