ARROW-15332: [C++] Add new cases and fix issues in IPC read/write benchmark

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.0.0
    • Component/s: C++

    Description

      This breaks the benchmark changes out of ARROW-14577 so that the effect of that PR is easier to demonstrate.

      The current benchmark has a few problems, which this change addresses:

      • The benchmark named ReadFile is misleading: it actually reads from an in-memory buffer, and no OS "read" call is ever issued.
      • ReadTempFile is renamed to ReadCachedFile, and a second case, ReadUncachedFile, is added. The former reads a file that is already in the OS page cache; the latter forces the read to actually hit the disk.
      • The TempFile benchmarks were not writing the correct amount of data and reported unrealistically high rates as a result.
      • A "partial read" parameter is added which, when true, reads only 1/8 of the columns in the file so we can see the impact of projection pushdown.
      • The range of parameters is slightly reduced to keep the benchmark time reasonable (8k columns wasn't telling us anything more than 4k columns).
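
      The "partial read" parameter can be illustrated with a minimal sketch. This is not the code from the PR: the helper names ColumnsToRead and BytesPerIteration are hypothetical, and picking every 8th column is an assumption (the ticket only says that 1/8 of the columns are read). The second helper shows the kind of byte accounting a benchmark needs if its reported rate is to match the data actually moved.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: indices of the columns a benchmark iteration reads.
// Assumption: a partial read keeps every 8th column (1/8 of the file);
// a full read keeps all of them.
std::vector<int> ColumnsToRead(int num_columns, bool partial_read) {
  assert(num_columns >= 0);
  const int stride = partial_read ? 8 : 1;
  std::vector<int> indices;
  for (int i = 0; i < num_columns; i += stride) {
    indices.push_back(i);
  }
  return indices;
}

// Hypothetical helper: bytes processed per iteration, assuming every column
// has the same size. Counting only the selected columns keeps a reported
// bytes/second rate tied to the data that was really read.
int64_t BytesPerIteration(int64_t bytes_per_column, int num_columns,
                          bool partial_read) {
  const auto selected = ColumnsToRead(num_columns, partial_read);
  return bytes_per_column * static_cast<int64_t>(selected.size());
}
```

      With 4096 columns, ColumnsToRead(4096, true) selects 512 of them, so a partial read under these assumptions touches one-eighth of the bytes of a full read.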

            People

              Assignee: Weston Pace (westonpace)
              Reporter: Weston Pace (westonpace)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 3h 50m