Apache Arrow / ARROW-10135

[Rust] [Parquet] Refactor file module to help adding sources

Details

    Description

      Currently, the Parquet reader is tightly coupled to file system reads, which makes it hard to add other sources. For instance, an S3 implementation would need a reader that loads entire columns at once rather than performing buffered reads of a few KB.

      To improve modularity, we could try to move as much logic as possible to the generic traits (FileReader, RowGroupReader...) and reduce the code in the implementing structs (SerializedFileReader, SerializedRowGroupReader...) to the part that is specific to file/buffered reads.
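
The separation described above could look roughly like the following minimal sketch. The names `ByteSource`, `InMemorySource`, `size`, `read_range` and `footer_metadata_len` are hypothetical and chosen for illustration; they are not the crate's existing API. The in-memory source stands in for a remote store such as S3, where each range would be fetched in a single request. The only format detail relied on is that a Parquet file ends with a 4-byte little-endian metadata length followed by the magic bytes `PAR1`.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical abstraction over a byte source (illustrative only).
pub trait ByteSource {
    /// Reader type handed back for a requested byte range.
    type Reader: Read;

    /// Total size of the source in bytes.
    fn size(&self) -> u64;

    /// Returns a reader over `length` bytes starting at offset `start`.
    fn read_range(&self, start: u64, length: usize) -> io::Result<Self::Reader>;
}

/// In-memory source standing in for a remote object store: the whole
/// requested range is materialized at once instead of being buffered.
pub struct InMemorySource {
    data: Vec<u8>,
}

impl InMemorySource {
    pub fn new(data: Vec<u8>) -> Self {
        Self { data }
    }
}

impl ByteSource for InMemorySource {
    type Reader = Cursor<Vec<u8>>;

    fn size(&self) -> u64 {
        self.data.len() as u64
    }

    fn read_range(&self, start: u64, length: usize) -> io::Result<Self::Reader> {
        let len = self.data.len() as u64;
        // Reject ranges that overflow or extend past the end of the data.
        let end = start
            .checked_add(length as u64)
            .filter(|&e| e <= len)
            .ok_or_else(|| {
                io::Error::new(io::ErrorKind::UnexpectedEof, "range past end of source")
            })?;
        // `end <= len`, so both offsets fit in `usize`.
        Ok(Cursor::new(self.data[start as usize..end as usize].to_vec()))
    }
}

/// Source-agnostic logic: reads the last 8 bytes of the file and returns
/// the metadata length stored there, after checking the `PAR1` magic.
pub fn footer_metadata_len<S: ByteSource>(source: &S) -> io::Result<u32> {
    const FOOTER_SIZE: u64 = 8;
    if source.size() < FOOTER_SIZE {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "file too small"));
    }
    let mut buf = [0u8; 8];
    source
        .read_range(source.size() - FOOTER_SIZE, buf.len())?
        .read_exact(&mut buf)?;
    if &buf[4..] != b"PAR1" {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "missing PAR1 magic"));
    }
    Ok(u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]))
}
```

With this shape, footer parsing is written once against the trait, and a file-backed or S3-backed source would only need to implement `size` and `read_range`.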

            People

              rdettai Rémi Dettai
              Votes: 1
              Watchers: 6

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 7.5h