Details
- Type: Task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
DRILL-5846 is meant to improve the Flat Parquet reader performance. The associated implementation resulted in a 2x - 4x performance improvement.
- During the review process ([pull request|https://github.com/apache/drill/pull/1060]), a few key questions arose
Intermediary Processing via Direct Memory vs Byte Arrays
- The main reasons for using byte arrays for intermediary processing are to a) avoid the high cost of the DrillBuf checks (especially the reference counting) and b) benefit from some observed Java optimizations when accessing byte arrays
- Starting with version 1.12.0, the DrillBuf enablement checks have been refined so that memory access and reference counting checks can be enabled independently
- Benchmarking Java's direct-memory unsafe methods using JMH indicates that the performance gap between heap and direct memory is very narrow except for a few use cases
- There are also concerns that the extra copy step (from direct memory into byte arrays) will have a negative effect on performance; note that this overhead was not observed using Intel's VTune, as the intermediary buffers were a) pinned to a single CPU, b) reused, and c) small enough to remain in the L1 cache during columnar processing.
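To make the intermediary-buffer pattern above concrete, here is a minimal, self-contained sketch (not Drill code; the class name, the summing workload, and the 4 KB chunk size are illustrative assumptions) that bulk-copies fixed-size chunks from a direct ByteBuffer into a single reused on-heap byte array and does all per-byte work on the heap copy:

```java
import java.nio.ByteBuffer;

public class IntermediaryCopy {
    // Hypothetical chunk size, chosen small enough to stay resident in the L1 cache.
    private static final int CHUNK = 4096;
    // Reused across calls so the allocation cost is paid once.
    private final byte[] scratch = new byte[CHUNK];

    // Sums all bytes of a direct buffer by copying fixed-size chunks
    // into the reused on-heap scratch array and processing them there.
    long sum(ByteBuffer direct) {
        long total = 0;
        direct.rewind();
        while (direct.hasRemaining()) {
            int len = Math.min(CHUNK, direct.remaining());
            direct.get(scratch, 0, len); // single bulk copy: direct -> heap
            for (int i = 0; i < len; i++) {
                total += scratch[i] & 0xFF; // tight loop over a plain byte[]
            }
        }
        return total;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(10_000);
        while (buf.hasRemaining()) {
            buf.put((byte) 1);
        }
        System.out.println(new IntermediaryCopy().sum(buf)); // prints 10000
    }
}
```

The point of contention is exactly this shape: the bulk `get` is the "extra copy step", while the inner loop over `scratch` is where the JIT-friendly byte-array access happens.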
Goal
- The Flat Parquet reader is among the few Drill columnar operators
- It is imperative that we agree on the optimal processing pattern so that the decisions taken within this Jira are applied not only to Parquet but to all Drill columnar operators
Methodology
- Assess the performance impact of using intermediary byte arrays (as described above)
- Prototype a solution using Direct Memory under three DrillBuf configurations: all checks off, access checks only, and all checks on
- Make an educated decision on which processing pattern should be adopted
- Decide whether it is acceptable to use Java's unsafe API (and through what mechanism) on byte arrays (when the use of byte arrays is a necessity)
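On the "through what mechanism" question in the last bullet: one candidate, sketched below, is a `VarHandle` byte-array view (Java 9+), which gives Unsafe-style multi-byte reads from a `byte[]` without depending on `sun.misc.Unsafe`. This is an illustrative assumption, not the mechanism the Jira settled on; the class and method names are hypothetical:

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

public class ByteArrayIntReader {
    // A VarHandle view that reinterprets a byte[] as little-endian ints,
    // allowing multi-byte reads at arbitrary byte offsets.
    private static final VarHandle INT_VIEW =
        MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    // Reads a little-endian int starting at the given byte offset.
    static int getInt(byte[] data, int byteOffset) {
        return (int) INT_VIEW.get(data, byteOffset);
    }

    public static void main(String[] args) {
        byte[] data = {0x04, 0x00, 0x00, 0x00};
        System.out.println(getInt(data, 0)); // prints 4
    }
}
```

Unlike raw `Unsafe`, the view still performs bounds checks, so it answers the "is it ok" part only partially; benchmarking both mechanisms would fit naturally into the JMH methodology above.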