  1. Apache Arrow
  2. ARROW-15589

[C++] Add support for sliced Substrait reads

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++

    Description

      The Substrait format allows for "sliced reads", which read only part of a file. They would most likely be used when the read of a single file is distributed across multiple workers.

      For each file, a start byte and a length are specified. Files that contain indivisible "groups" (e.g. Parquet row groups) cannot be cut at an arbitrary byte, so the reader must pick a heuristic that maps the byte range onto whole groups. For example, read all row groups whose midpoint falls within the interval.
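
      The midpoint heuristic mentioned above could look roughly like the sketch below. It is written in Python for brevity and is not the Arrow C++ implementation; `RowGroup` and `select_row_groups` are hypothetical names, and treating the slice as the half-open interval [start, start + length) is an assumption made here so that slices which tile a file assign each row group to exactly one slice.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RowGroup:
    """Hypothetical stand-in for the byte extent of one Parquet row group."""
    index: int
    offset: int  # first byte of the group within the file
    length: int  # size of the group in bytes


def select_row_groups(groups: List[RowGroup], start: int, length: int) -> List[int]:
    """Return indices of groups whose midpoint lies in [start, start + length).

    The half-open interval is an assumption of this sketch, not something
    the issue specifies.
    """
    end = start + length
    selected = []
    for g in groups:
        midpoint = g.offset + g.length // 2
        if start <= midpoint < end:
            selected.append(g.index)
    return selected


# Three 100-byte row groups, read as two 150-byte slices.
groups = [RowGroup(0, 0, 100), RowGroup(1, 100, 100), RowGroup(2, 200, 100)]
first_half = select_row_groups(groups, 0, 150)
second_half = select_row_groups(groups, 150, 150)
```

      In this example the second row group straddles the slice boundary at byte 150. Because its midpoint is exactly 150, it goes to the second slice only, so no group is read twice or skipped.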

          People

            Assignee: Unassigned
            Reporter: Weston Pace (westonpace)