Parquet / PARQUET-2321

Allow a customized buffer size when creating the ArrowInputStream for a column PageReader


Details

    Description

      When buffered streams are enabled, every column chunk, regardless of its actual size, currently uses the same buffer size, which is stored in the shared [read properties](https://github.com/apache/arrow/blob/main/cpp/src/parquet/file_reader.cc#L213).  

      Given a limited memory budget, one may want to customize the buffer size for each column chunk based on its actual size, so that smaller chunks consume less of the memory budget for their buffers.
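
      To illustrate the idea, here is a minimal sketch of how a per-chunk buffer size could be derived. It assumes the caller already knows each chunk's size in bytes and the shared buffer size from the read properties. `ChooseBufferSize` and `TotalBufferBytes` are hypothetical helpers made up for this example; they are not part of the Arrow or Parquet API, and this is not necessarily how the issue would be resolved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow/Parquet API): pick a buffer size for one
// column chunk, capped at the shared buffer size from the read properties.
inline int64_t ChooseBufferSize(int64_t chunk_bytes, int64_t shared_buffer_size) {
  assert(shared_buffer_size > 0);
  // A chunk never needs a buffer larger than itself. Keep at least 1 byte so
  // that an empty or unknown-size chunk still gets a valid buffer size.
  return std::clamp<int64_t>(chunk_bytes, 1, shared_buffer_size);
}

// Hypothetical helper: total buffer memory needed when every chunk gets its
// own size-aware buffer instead of the shared size.
inline int64_t TotalBufferBytes(const std::vector<int64_t>& chunk_sizes,
                                int64_t shared_buffer_size) {
  int64_t total = 0;
  for (int64_t size : chunk_sizes) {
    total += ChooseBufferSize(size, shared_buffer_size);
  }
  return total;
}
```

      For example, with a shared buffer size of 16384 bytes and three chunks of 100, 1048576, and 4096 bytes, size-aware buffers would need 20580 bytes in total, compared with 49152 bytes when every chunk uses the shared size.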

            People

              Assignee: Unassigned
              Reporter: Jinpeng Zhou (zjpzlz)

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 6.5h