Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Resolved
Description
For a relatively complex filter expression, Drill may spend roughly 10-20 seconds on run-time code compilation. The expression is admittedly complex, and a repeat of the same query benefits from the cached run-time code, but we should still consider ways to reduce the compilation time when a query is processed for the first time, in order to make Drill more "interactive".
For instance, for the following query, the log shows that 7 seconds were spent on code compilation, while the whole query returned within 9 seconds.
select *
from cp.`tpch/nation.parquet` t
WHERE (
  (
    ( cast(substring(t.CID, 9) as integer) = 28)
    AND (
      (
        ( cast(substring(t.LID, 8) as integer) = 1)
        AND (
          (( cast(substring(t.TYEAR, 11) as integer) = 2011) AND (( cast(t.GRP_ID as integer) IN (1, 2)) AND ( cast(t.VM_DATEID as integer) = 20111201)))
          OR (( cast(t.TYEAR as integer) = 2012) AND (( cast(t.GRP_ID as integer) IN (1, 2)) AND ( cast(t.VM_DATEID as integer) = 20121201)))
          OR (( cast(substring(t.TYEAR, 11) as integer) = 2013) AND (( cast(t.GRP_ID as integer) IN (1, 2)) AND ( cast(t.VM_DATEID as integer) = 20131201)))
        )
      )
      OR (
        ( cast(substring(t.LID, 8) as integer) = 2)
        AND (
          (( cast(substring(t.TYEAR, 11) as integer) = 2011) AND (( cast(t.GRP_ID as integer) IN (-1, 1, 2)) AND ( cast(t.VM_DATEID as integer) = 20111201)))
          OR (( cast(t.TYEAR as integer) = 2013) AND (( cast(t.GRP_ID as integer) IN (-1, 1, 2)) AND ( cast(t.VM_DATEID as integer) = 20131201)))
        )
      )
    )
  )
  AND (( cast(t.FLAG as integer) = -1) OR ( cast(t.FLAG as integer) = 0))
)