Apache Arrow / ARROW-5428

[C++] Add option to set "read extent" in arrow::io::BufferedInputStream

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: C++
    • Labels: None

    Description

      I'm looking at simplifying libparquet to use common IO interfaces rather than its own custom ones.

      The parquet::BufferedInputStream interface has an option to not read beyond a particular number of bytes. For example, if we are reading a 32MB block with 1MB buffering, we will not consume more than 32MB from the raw InputStream.

      This seems like a fairly trivial addition to arrow::io::BufferedInputStream: track the total number of bytes read from the raw stream and do not read beyond the configured extent. We'll have to add a method like set_read_extent.
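
The idea above can be sketched in standalone C++ without depending on Arrow. This is only an illustration under stated assumptions: RawStream and BoundedBufferedReader are hypothetical stand-ins (not Arrow or Parquet classes), and set_read_extent is the method name floated in this issue, not a confirmed API. The point it shows is that each buffer refill is clamped so the total number of bytes pulled from the raw stream never exceeds the extent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a raw InputStream: serves bytes from a string
// and records how many bytes have been consumed from it.
class RawStream {
 public:
  explicit RawStream(std::string data) : data_(std::move(data)) {}

  // Copies up to nbytes into out; returns the number of bytes actually read.
  int64_t Read(int64_t nbytes, char* out) {
    const int64_t remaining = static_cast<int64_t>(data_.size()) - pos_;
    const int64_t n = std::min(nbytes, remaining);
    std::memcpy(out, data_.data() + pos_, static_cast<size_t>(n));
    pos_ += n;
    return n;
  }

  int64_t bytes_consumed() const { return pos_; }

 private:
  std::string data_;
  int64_t pos_ = 0;
};

// Hypothetical buffered reader with an optional read extent.
class BoundedBufferedReader {
 public:
  BoundedBufferedReader(RawStream* raw, int64_t buffer_size)
      : raw_(raw), buffer_(static_cast<size_t>(buffer_size)) {
    assert(buffer_size > 0);
  }

  // Caps the total bytes ever requested from the raw stream.
  // A negative value (the default) means "no limit".
  void set_read_extent(int64_t extent) { read_extent_ = extent; }

  // Reads up to nbytes into out; returns fewer once the extent or the end
  // of the raw stream is reached.
  int64_t Read(int64_t nbytes, char* out) {
    int64_t total = 0;
    while (total < nbytes) {
      if (buffer_pos_ == buffer_len_ && !FillBuffer()) break;
      const int64_t n = std::min(nbytes - total, buffer_len_ - buffer_pos_);
      std::memcpy(out + total, buffer_.data() + buffer_pos_,
                  static_cast<size_t>(n));
      buffer_pos_ += n;
      total += n;
    }
    return total;
  }

 private:
  // Refills the buffer, clamping the request to what the extent still allows.
  bool FillBuffer() {
    int64_t to_read = static_cast<int64_t>(buffer_.size());
    if (read_extent_ >= 0) {
      to_read = std::min(to_read, read_extent_ - raw_bytes_read_);
    }
    if (to_read <= 0) return false;
    buffer_len_ = raw_->Read(to_read, buffer_.data());
    buffer_pos_ = 0;
    raw_bytes_read_ += buffer_len_;
    return buffer_len_ > 0;
  }

  RawStream* raw_;
  std::vector<char> buffer_;
  int64_t buffer_pos_ = 0;
  int64_t buffer_len_ = 0;
  int64_t raw_bytes_read_ = 0;
  int64_t read_extent_ = -1;
};
```

With a 16-byte buffer and an extent of 40 bytes, a request for 64 bytes would trigger refills of 16, 16 and 8 bytes and then stop, leaving the rest of the raw stream untouched for whoever reads it next. That mirrors the 32MB block with 1MB buffering case described above at a smaller scale.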

          People

            Assignee: Wes McKinney (wesm)
            Reporter: Wes McKinney (wesm)
