ORC-742 introduced lazy evaluation of the non-filter columns in the presence of filters. This change builds on that work by converting a SArg (search argument) into a filter.
SArg to Filter
SArg to Filter converts the passed SArg into a filter. This enables automatic compatibility with both Spark and Hive, as they already push Search Arguments down to ORC.
The SArg is automatically converted into a Vector Filter, which is applied during the read process.
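As a rough illustration of the idea, the sketch below models a predicate tree that is applied to a batch of column values and narrows a selection vector of row indexes. It is a minimal sketch under stated assumptions, not ORC's implementation: `SargFilterSketch`, `RowFilter`, `lessThan`, `equalsTo`, `and`, `or` and `select` are hypothetical names, and columns are simplified to `long[]` arrays.

```java
import java.util.Arrays;

/** Hypothetical model (not ORC classes): a predicate tree applied to a column batch. */
final class SargFilterSketch {

    /** Narrows sel[0..size) in place to the rows that pass; returns the new size. */
    interface RowFilter {
        int apply(long[][] cols, int[] sel, int size);
    }

    /** Leaf predicate: column value < v. */
    static RowFilter lessThan(int col, long v) {
        return (cols, sel, size) -> {
            int n = 0;
            for (int i = 0; i < size; i++) {
                int row = sel[i];
                if (cols[col][row] < v) sel[n++] = row;
            }
            return n;
        };
    }

    /** Leaf predicate: column value == v. */
    static RowFilter equalsTo(int col, long v) {
        return (cols, sel, size) -> {
            int n = 0;
            for (int i = 0; i < size; i++) {
                int row = sel[i];
                if (cols[col][row] == v) sel[n++] = row;
            }
            return n;
        };
    }

    /** AND: each child only sees the rows that survived the previous child. */
    static RowFilter and(RowFilter... children) {
        return (cols, sel, size) -> {
            int n = size;
            for (RowFilter c : children) n = c.apply(cols, sel, n);
            return n;
        };
    }

    /** OR: a row survives if any child accepts it; the original row order is kept. */
    static RowFilter or(RowFilter... children) {
        return (cols, sel, size) -> {
            boolean[] keep = new boolean[cols[0].length];
            for (RowFilter c : children) {
                int[] tmp = Arrays.copyOf(sel, size);
                int accepted = c.apply(cols, tmp, size);
                for (int i = 0; i < accepted; i++) keep[tmp[i]] = true;
            }
            int n = 0;
            for (int i = 0; i < size; i++) {
                int row = sel[i];
                if (keep[row]) sel[n++] = row;
            }
            return n;
        };
    }

    /** Runs the filter over all rows of the batch and returns the selected row indexes. */
    static int[] select(RowFilter f, long[][] cols) {
        int rows = cols[0].length;
        int[] sel = new int[rows];
        for (int i = 0; i < rows; i++) sel[i] = i;
        return Arrays.copyOf(sel, f.apply(cols, sel, rows));
    }
}
```

For example, with column 0 holding `{1, 5, 10, 20}` and column 1 holding `{7, 7, 3, 7}`, the filter `and(lessThan(0, 15), or(equalsTo(1, 3), lessThan(0, 2)))` selects rows 0 and 2. Because the tree is evaluated as written, no normalization of the predicate is needed for this step.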
The builder for search argument should allow skipping normalization during the build. This has already been proposed as part of
Normalization performs very poorly in the presence of multilevel predicates.
- fSize identifies the size of the OR clause that will be normalized.
- normalize indicates whether normalization was carried out on the Search Argument.
- Normalizing the search argument results in a significant performance penalty, given the explosion of the operator tree.
- In the case where an AND includes 8 ORs, the unnormalized version is faster by 97.32%.
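The explosion of the operator tree can be illustrated with a small counting sketch. It assumes a textbook conversion to conjunctive normal form by distributing OR over AND, with no deduplication or size cap. The predicate shape used here (an OR of two-leaf ANDs) is an assumption chosen for illustration and is not necessarily the shape used in the benchmark above. `NormalizationBlowup` and its members are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model: counts clauses produced by naive CNF distribution. */
final class NormalizationBlowup {
    enum Kind { LEAF, AND, OR }

    static final class Node {
        final Kind kind;
        final List<Node> children;

        Node(Kind kind, List<Node> children) {
            this.kind = kind;
            this.children = children;
        }
    }

    static Node leaf() { return new Node(Kind.LEAF, Arrays.asList()); }
    static Node and(Node... c) { return new Node(Kind.AND, Arrays.asList(c)); }
    static Node or(Node... c) { return new Node(Kind.OR, Arrays.asList(c)); }

    /** Number of leaf predicates in the tree as written. */
    static long leafCount(Node n) {
        if (n.kind == Kind.LEAF) return 1;
        long total = 0;
        for (Node c : n.children) total += leafCount(c);
        return total;
    }

    /**
     * Number of clauses after naive distribution into CNF:
     * AND concatenates its children's clauses (sum), while OR must pair
     * every clause of one child with every clause of the others (product).
     */
    static long cnfClauseCount(Node n) {
        switch (n.kind) {
            case LEAF:
                return 1;
            case AND: {
                long sum = 0;
                for (Node c : n.children) sum += cnfClauseCount(c);
                return sum;
            }
            default: {
                long product = 1;
                for (Node c : n.children) product *= cnfClauseCount(c);
                return product;
            }
        }
    }

    /** Builds an OR over `terms` ANDs, each AND holding `width` leaves. */
    static Node orOfAnds(int terms, int width) {
        Node[] ands = new Node[terms];
        for (int i = 0; i < terms; i++) {
            Node[] leaves = new Node[width];
            for (int j = 0; j < width; j++) leaves[j] = leaf();
            ands[i] = and(leaves);
        }
        return or(ands);
    }
}
```

Under these assumptions, `orOfAnds(8, 2)` has 16 leaves as written but expands to 2^8 = 256 clauses, so the clause count grows exponentially with the number of nested terms while the unnormalized tree grows linearly.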
ORC-1382 Fix secondary config names `org.sarg.*` to `orc.sarg.*`
ORC-954 Fix Javadoc generation failure
- depends upon
ORC-742 LazyIO of non-filter columns in the presence of filters
HIVE-24458 Allow access to SArgs without converting to disjunctive normal form