  1. Apache Arrow
  2. ARROW-8782

[Rust] [DataFusion] Add benchmarks based on NYC Taxi data set


Details

    Description

      I plan to add a new benchmarks folder beneath the datafusion crate, containing benchmarks based on the NYC Taxi data set. The benchmark will be a CLI that can run a number of different queries against CSV and Parquet files.

      The README will contain instructions for downloading the data set.

      The benchmark will produce CSV files containing results.
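
      As a rough illustration of the results output, here is a minimal sketch that times a workload over several iterations and writes one CSV row per run. It is not the planned implementation: the `BenchResult` struct, the `time_query` and `write_csv` functions, and the column names are all hypothetical, and the closure is a placeholder standing in for executing a real query. It uses only the Rust standard library and assumes query and format names contain no commas.

```rust
use std::io::{self, Write};
use std::time::{Duration, Instant};

/// One timed run of a query (hypothetical record layout, not taken from the issue).
struct BenchResult {
    query: String,
    format: String, // e.g. "csv" or "parquet"
    iteration: usize,
    elapsed: Duration,
}

/// Run `work` `iterations` times and record the wall-clock time of each run.
fn time_query<F: FnMut()>(
    query: &str,
    format: &str,
    iterations: usize,
    mut work: F,
) -> Vec<BenchResult> {
    (0..iterations)
        .map(|iteration| {
            let start = Instant::now();
            work();
            BenchResult {
                query: query.to_string(),
                format: format.to_string(),
                iteration,
                elapsed: start.elapsed(),
            }
        })
        .collect()
}

/// Write the results as CSV: a header row, then one row per timed run.
fn write_csv<W: Write>(out: &mut W, results: &[BenchResult]) -> io::Result<()> {
    writeln!(out, "query,format,iteration,elapsed_ms")?;
    for r in results {
        writeln!(
            out,
            "{},{},{},{:.3}",
            r.query,
            r.format,
            r.iteration,
            r.elapsed.as_secs_f64() * 1000.0
        )?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Placeholder workload; a real benchmark would execute a query here.
    let results = time_query("q1", "parquet", 3, || {
        let sum: u64 = (0..1_000_000u64).sum();
        std::hint::black_box(sum);
    });
    write_csv(&mut io::stdout(), &results)
}
```

      Writing to any `Write` implementor keeps the same function usable for stdout, a file, or an in-memory buffer.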

      These benchmarks will allow us to manually verify performance before major releases and on an ongoing basis as we make changes to Arrow/Parquet/DataFusion.

      I will be basing this on existing benchmarks I recently built in Ballista [1] (I am the only contributor to these benchmarks so far).

      A Dockerfile will be provided, making it easy to restrict CPU and RAM when running these benchmarks.

      [1] https://github.com/ballista-compute/ballista/tree/master/rust/benchmarks

       


          People

            andygrove Andy Grove
            andygrove Andy Grove
            Votes:
            0
            Watchers:
            5

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1.5h
