Details
-
Sub-task
-
Status: Resolved
-
Minor
-
Resolution: Fixed
-
None
Description
The ExecuteScalarExpressionOverhead group of benchmarks currently gives us values that we can compare across batch sizes or across expressions. But it does not show how well Arrow performs compared to what is possible in general.
The simple_expression (negate x) and complex_expression (x>0 and x<20) benchmarks, which perform an actual operation on data, can be implemented in pure C++ for comparison.
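For illustration, a pure C++ counterpart of the negate case could look roughly like the sketch below. The function name NegateBaseline and the int64_t element type are assumptions made for this example and are not taken from the actual benchmark code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical baseline for "negate x": writes -in[i] into out[i].
// The element type (int64_t) is an assumption for this sketch.
// Inputs are assumed not to contain INT64_MIN, whose negation overflows.
// Both buffers are passed in so the caller decides whether to reuse them.
void NegateBaseline(const std::vector<int64_t>& in, std::vector<int64_t>& out) {
  assert(in.size() == out.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = -in[i];
  }
}
```

A plain loop like this leaves vectorization to the compiler, so it is a baseline for "what straightforward C++ achieves" rather than a hand-tuned upper bound.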
I implemented the complex_expression benchmark using technically unnecessary intermediate buffers for the results of the > and < operators, to match what happens in the Arrow expression.
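A minimal sketch of that approach, with one pass per operator and separate intermediate buffers, might look as follows. The name ComplexBaseline, the int64_t input type, and the use of one uint8_t per boolean result are assumptions for this example; the real benchmark may represent booleans differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical baseline for (x > 0 and x < 20).
// gt and lt are the technically unnecessary intermediate buffers: each
// comparison is materialized in its own pass before the results are combined,
// instead of fusing everything into a single loop.
void ComplexBaseline(const std::vector<int64_t>& x,
                     std::vector<uint8_t>& gt,
                     std::vector<uint8_t>& lt,
                     std::vector<uint8_t>& out) {
  const std::size_t n = x.size();
  assert(gt.size() == n && lt.size() == n && out.size() == n);
  // Pass 1: x > 0
  for (std::size_t i = 0; i < n; ++i) {
    gt[i] = x[i] > 0 ? 1 : 0;
  }
  // Pass 2: x < 20
  for (std::size_t i = 0; i < n; ++i) {
    lt[i] = x[i] < 20 ? 1 : 0;
  }
  // Pass 3: combine the two intermediate results
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<uint8_t>(gt[i] & lt[i]);
  }
}
```

A fused single-loop version would avoid the extra memory traffic, but it would no longer mirror the operator-at-a-time evaluation being compared against.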
What may seem unfair is that I currently reuse the input, output, and intermediate buffers across all iterations. I also tried using new and delete each time, but could not measure a difference in performance. Reusing the buffers allows the use of std::vector for slightly cleaner code. Re-creating a vector each time would result in a lot of overhead from initializing the vector values and is therefore not useful.
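The two allocation strategies described above can be compared with a sketch like the following. All names here (Negate, RunReused, RunFreshNew, SecondsFor) are hypothetical, the negate kernel stands in for whichever expression is measured, and the timing helper is only a rough stand-in for a proper benchmark harness.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Kernel shared by both variants: out[i] = -in[i].
inline void Negate(const int64_t* in, int64_t* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = -in[i];
}

// Variant A: a single std::vector output buffer reused across all iterations.
// The returned checksum keeps the compiler from discarding the work.
int64_t RunReused(const std::vector<int64_t>& in, int iterations) {
  assert(iterations >= 0);
  std::vector<int64_t> out(in.size());
  int64_t checksum = 0;
  for (int it = 0; it < iterations; ++it) {
    Negate(in.data(), out.data(), in.size());
    if (!in.empty()) checksum += out[0];
  }
  return checksum;
}

// Variant B: a fresh buffer from new[] in every iteration. Unlike constructing
// a std::vector, new int64_t[n] does not initialize the elements.
int64_t RunFreshNew(const std::vector<int64_t>& in, int iterations) {
  assert(iterations >= 0);
  int64_t checksum = 0;
  for (int it = 0; it < iterations; ++it) {
    std::unique_ptr<int64_t[]> out(new int64_t[in.size()]);
    Negate(in.data(), out.get(), in.size());
    if (!in.empty()) checksum += out[0];
  }
  return checksum;
}

// Rough wall-clock timing of a callable, in seconds.
template <typename F>
double SecondsFor(F&& f) {
  const auto start = std::chrono::steady_clock::now();
  f();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(end - start).count();
}
```

Calling SecondsFor on a lambda that runs each variant with the same input and iteration count gives a coarse comparison; any difference observed this way depends on the allocator and batch size.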
Example output: example-output-baseline.txt
Attachments