Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
0.12.0, 0.11.1
-
None
Description
In some cases, Partition Pruning doesn't handle Constant folding for {{BETWEEN|| clause as shown below.
default> CREATE EXTERNAL TABLE student ( id INT, name TEXT, grade TEXT, phone TEXT ) USING TEXT PARTITION BY COLUMN (entrance_date TEXT) LOCATION '/data/student'; default> \dfs -ls /data/student; Found 6 items drwx------ - tajo supergroup 0 2016-03-14 11:13 /data/student/entrance_date=20160308 drwx------ - tajo supergroup 0 2016-03-14 11:13 /data/student/entrance_date=20160309 drwx------ - tajo supergroup 0 2016-03-14 11:13 /data/student/entrance_date=20160310 drwx------ - tajo supergroup 0 2016-03-14 11:13 /data/student/entrance_date=20160311 drwx------ - tajo supergroup 0 2016-03-14 11:13 /data/student/entrance_date=20160312 drwx------ - tajo supergroup 0 2016-03-14 11:13 /data/student/entrance_date=20160313 default> \! date 2016. 03. 14. (MON) 15:19:27 KST default> SELECT * FROM student WHERE entrance_date BETWEEN TO_CHAR(ADD_DAYS(CURRENT_DATE(), -2) , 'YYYYMMDD') AND TO_CHAR( ADD_DAYS(CURRENT_DATE(), -1) , 'YYYYMMDD');
Above select query returns two rows as expected. But when I found logical plan on Tajo web UI, tajo retrieves all sub-directories of the example table as following:
PARTITIONS_SCAN(4) on default.student => filter: default.student.entrance_date (TEXT) BETWEEN ASYMMETRIC to_char(add_days(current_date(),-2),YYYYMMDD) AND to_char(add_days(current_date(),-1),YYYYMMDD) => target list: default.student.id (INT4), default.student.name (TEXT), default.student.grade (TEXT), default.student.phone (TEXT), default.student.entrance_date (TEXT) => num of filtered paths: 6 => out schema: {(5) default.student.id (INT4), default.student.name (TEXT), default.student.grade (TEXT), default.student.phone (TEXT), default.student.entrance_date (TEXT)} => in schema: {(4) default.student.id (INT4), default.student.name (TEXT), default.student.grade (TEXT), default.student.phone (TEXT)} => 0: hdfs://localhost:9010/data/student/entrance_date=20160308 => 1: hdfs://localhost:9010/data/student/entrance_date=20160309 => 2: hdfs://localhost:9010/data/student/entrance_date=20160310 => 3: hdfs://localhost:9010/data/student/entrance_date=20160311 => 4: hdfs://localhost:9010/data/student/entrance_date=20160312 => 5: hdfs://localhost:9010/data/student/entrance_date=20160313