  1. Apache Arrow
  2. ARROW-16138 [C++] Improve performance of ExecuteScalarExpression
  3. ARROW-16599

[C++] Implementation of ExecuteScalarExpressionOverhead benchmarks without Arrow for comparison


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 9.0.0
    • Component/s: C++

    Description

      The ExecuteScalarExpressionOverhead group of benchmarks currently gives us numbers that we can compare across batch sizes or across expressions. It does not show how well Arrow performs relative to what is achievable without it.

      The simple_expression (negate x) and complex_expression (x > 0 and x < 20) benchmarks, which perform an actual operation on data, can be implemented in pure C++ for comparison.
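      As a rough illustration of what a pure C++ counterpart to simple_expression could look like, here is a minimal sketch. The function name NegateBaseline and the int64_t element type are assumptions made for this example; they are not taken from the issue or from the Arrow benchmark code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical baseline for simple_expression (negate x): out[i] = -in[i].
// Assumption: elements are int64_t and no element equals INT64_MIN,
// whose negation would overflow.
void NegateBaseline(const std::vector<int64_t>& input,
                    std::vector<int64_t>& output) {
  // The caller provides an output buffer of the same length as the input.
  assert(input.size() == output.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    output[i] = -input[i];
  }
}
```

      A benchmark loop would call this once per iteration on a batch of the same size as the one passed to the Arrow expression.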

      I implemented the complex_expression benchmark using technically unnecessary intermediate buffers for the > and < operator results, to match what happens in the Arrow expression.
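      The intermediate-buffer approach might be sketched as three separate passes, one per operator. The name ComplexExpressionBaseline, the int64_t input type, and the use of one byte per boolean are assumptions for illustration; the actual benchmark may represent the comparison results differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical baseline for complex_expression (x > 0 and x < 20).
// Each operator writes to its own buffer, mirroring an evaluation that
// materializes every sub-expression instead of fusing them into one loop.
// Assumption: booleans are stored as one byte each (0 or 1).
void ComplexExpressionBaseline(const std::vector<int64_t>& input,
                               std::vector<uint8_t>& greater_than_0,
                               std::vector<uint8_t>& less_than_20,
                               std::vector<uint8_t>& output) {
  const std::size_t n = input.size();
  assert(greater_than_0.size() == n);
  assert(less_than_20.size() == n);
  assert(output.size() == n);

  // Pass 1: x > 0
  for (std::size_t i = 0; i < n; ++i) {
    greater_than_0[i] = input[i] > 0 ? 1 : 0;
  }
  // Pass 2: x < 20
  for (std::size_t i = 0; i < n; ++i) {
    less_than_20[i] = input[i] < 20 ? 1 : 0;
  }
  // Pass 3: combine both intermediate results with "and"
  for (std::size_t i = 0; i < n; ++i) {
    output[i] = static_cast<uint8_t>(greater_than_0[i] & less_than_20[i]);
  }
}
```

      A fused version would compute the result in a single loop without the two intermediate buffers; the three-pass form is kept here only because it does the same amount of memory traffic as an operator-by-operator evaluation.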

      What may seem unfair is that I currently reuse the input/output/intermediate buffers across all iterations. I also tried using new and delete on each iteration, but could not measure a difference in performance. Reusing the buffers allows me to use std::vector for slightly cleaner code. Re-creating a vector on each iteration would add a lot of overhead for initializing the vector values and is therefore not useful.
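      The two allocation strategies described above can be contrasted in a small sketch. All names here (NegateRaw, RunWithReusedBuffer, RunWithFreshAllocation) are hypothetical, and the checksum exists only so the work has an observable result; this is not the benchmark code attached to the issue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Kernel shared by both variants: out[i] = -in[i].
void NegateRaw(const int64_t* in, int64_t* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = -in[i];
}

// Variant A: one std::vector allocated (and zero-initialized) once,
// then reused by every iteration.
int64_t RunWithReusedBuffer(const std::vector<int64_t>& input, int iterations) {
  std::vector<int64_t> output(input.size());
  int64_t checksum = 0;
  for (int it = 0; it < iterations; ++it) {
    NegateRaw(input.data(), output.data(), input.size());
    if (!output.empty()) checksum += output.back();
  }
  return checksum;
}

// Variant B: a fresh buffer per iteration via new[] / delete[]
// (wrapped in unique_ptr). new int64_t[n] leaves the elements
// uninitialized, so unlike constructing std::vector<int64_t>(n) it does
// not pay for writing zeros before the kernel overwrites them.
int64_t RunWithFreshAllocation(const std::vector<int64_t>& input, int iterations) {
  int64_t checksum = 0;
  for (int it = 0; it < iterations; ++it) {
    std::unique_ptr<int64_t[]> output(new int64_t[input.size()]);
    NegateRaw(input.data(), output.get(), input.size());
    if (!input.empty()) checksum += output[input.size() - 1];
  }
  return checksum;
}
```

      Both variants compute the same values; they differ only in when the output buffer is allocated and whether it is initialized first.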

      Example output: example-output-baseline.txt

      Attachments

        1. example-output-baseline.txt
          7 kB
          Tobias Zagorni

            People

              Assignee:
              zagto Tobias Zagorni
              Reporter:
              zagto Tobias Zagorni
              Votes:
              0
              Watchers:
              3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  1.5h