Apache Arrow / ARROW-8908

[Rust][DataFusion] improve performance of building literal arrays

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Rust - DataFusion
    • Labels: None

    Description

      andygrove, I was doing some profiling and noticed a potential performance improvement, described below.

      NOTE: The issue described below would be irrelevant if it were possible to use scalar comparison operations in DataFusion, as described here:
      https://issues.apache.org/jira/browse/ARROW-8907

      The `build_literal_array` function is defined here: https://github.com/apache/arrow/blob/master/rust/datafusion/src/execution/physical_plan/expressions.rs#L1204
      It creates an array of literal values by appending them one at a time in a loop, but benchmarks indicate that creating the array from a Vec is much faster
      (about 58 times faster when building an array of 100000 values: roughly 1.51 ms versus 25.9 us).
      Here are the benchmark results:

      array builder/array from vec: time: [25.644 us 25.883 us 26.214 us]
      array builder/array from values: time: [1.4985 ms 1.5090 ms 1.5213 ms]

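      Here is the benchmark code:
      ```
      // Imports assumed for the arrow and criterion crates.
      use arrow::array::{Array, PrimitiveArray, PrimitiveBuilder};
      use arrow::datatypes::Float32Type;
      use criterion::Criterion;

      fn bench_array_builder(c: &mut Criterion) {
          let array_len = 100000;
          let mut count = 0;
          let mut group = c.benchmark_group("array builder");

          // Build the array in one step from a Vec of repeated values.
          group.bench_function("array from vec", |b| {
              b.iter(|| {
                  let float_array: PrimitiveArray<Float32Type> = vec![1.0; array_len].into();
                  count = float_array.len();
              })
          });
          println!("built array with {} values", count);

          // Build the array one value at a time, as build_literal_array does.
          group.bench_function("array from values", |b| {
              b.iter(|| {
                  // let float_array: PrimitiveArray<Float32Type> = build_literal_array(1.0, array_len);
                  let mut builder = PrimitiveBuilder::<Float32Type>::new(array_len);
                  // Loop over array_len, not count, so this case does not depend on the previous one.
                  for _ in 0..array_len {
                      // Discard the return value explicitly instead of taking a reference to it.
                      let _ = builder.append_value(1.0);
                  }
                  let float_array = builder.finish();
                  count = float_array.len();
              })
          });
          println!("built array with {} values", count);
          group.finish();
      }
      ```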
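
      To make the difference between the two construction paths concrete without depending on the arrow crate, here is a minimal standard-library sketch. `SketchBuilder`, `literal_array_via_builder` and `literal_array_via_vec` are hypothetical names invented for illustration, and the builder is only a stand-in that assumes per-element bookkeeping (a value plus a validity flag on every append). It is not Arrow's actual `PrimitiveBuilder`, and it makes no claim about real timings.

      ```rust
      /// Hypothetical stand-in for an Arrow-style primitive builder:
      /// every append touches both a values buffer and a validity buffer.
      struct SketchBuilder {
          values: Vec<f32>,
          validity: Vec<bool>,
      }

      impl SketchBuilder {
          fn with_capacity(capacity: usize) -> Self {
              SketchBuilder {
                  values: Vec::with_capacity(capacity),
                  validity: Vec::with_capacity(capacity),
              }
          }

          /// Appends a single non-null value.
          fn append_value(&mut self, value: f32) {
              self.values.push(value);
              self.validity.push(true);
          }

          /// Consumes the builder and returns the values buffer.
          fn finish(self) -> Vec<f32> {
              debug_assert_eq!(self.values.len(), self.validity.len());
              self.values
          }
      }

      /// Element-by-element construction, mirroring the loop in the benchmark.
      fn literal_array_via_builder(value: f32, len: usize) -> Vec<f32> {
          let mut builder = SketchBuilder::with_capacity(len);
          for _ in 0..len {
              builder.append_value(value);
          }
          builder.finish()
      }

      /// Bulk construction: a single fill of `len` copies of `value`.
      fn literal_array_via_vec(value: f32, len: usize) -> Vec<f32> {
          vec![value; len]
      }
      ```

      Both functions return the same contents, for example `literal_array_via_builder(1.0, 4) == literal_array_via_vec(1.0, 4)`. The only difference is how much work is done per element, which is the effect the benchmark above measures on the real Arrow types.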

            People

              Assignee: Unassigned
              Reporter: Yordan Pavlov (yordan-pavlov)