Apache Arrow / ARROW-4314

[Rust] Strongly-typed reading of Parquet data

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Rust

    Description

      See the proposal I made on csun's repository here for more details.

      This aims to let the user opt in to strong typing and substantial performance improvements (2x-7x, see here) by optionally specifying the type of the records that they are iterating over.
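
      To illustrate what opting in to a record type could look like, here is a minimal, self-contained sketch. It does not use the parquet crate or the API from the proposal: `Value`, `Record`, `User` and `read_typed` are all hypothetical names invented for this example, and the in-memory rows stand in for decoded Parquet data. It shows only the typing side of the idea and says nothing about the performance figures.

```rust
/// Dynamically typed cell, standing in for a field of an untyped row.
/// (Hypothetical; not a type from the parquet crate.)
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

/// Hypothetical trait: a type that can be decoded from one untyped row.
trait Record: Sized {
    fn from_row(row: &[Value]) -> Result<Self, String>;
}

/// The record type the caller opts in to.
#[derive(Debug, PartialEq)]
struct User {
    id: i64,
    name: String,
}

impl Record for User {
    fn from_row(row: &[Value]) -> Result<Self, String> {
        // Accept only rows shaped exactly like (Int, Str); reject anything else.
        match row {
            [Value::Int(id), Value::Str(name)] => Ok(User {
                id: *id,
                name: name.clone(),
            }),
            other => Err(format!("row does not match User schema: {:?}", other)),
        }
    }
}

/// Typed iteration: the caller names the record type `R`, and every row is
/// checked against it, yielding either a typed value or an error.
fn read_typed<R: Record>(rows: &[Vec<Value>]) -> impl Iterator<Item = Result<R, String>> + '_ {
    rows.iter().map(|row| R::from_row(row.as_slice()))
}
```

      A caller would write something like `read_typed::<User>(&rows)` and receive `User` values rather than dynamically typed rows, so a schema mismatch surfaces as an error at the point of reading instead of at each field access.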

      It is currently a work in progress. All pre-existing tests succeed, bar those in src/record/api.rs which are commented out as they require reworking. Where relevant, pre-existing tests and benchmarks have been duplicated to make new strongly-typed tests and benchmarks, which all also succeed. I've tried to maintain pre-existing APIs where possible. Some changes have been made to better align with prior art in the Rust ecosystem.

      Any feedback while I continue working on it is very welcome! Looking forward to hopefully seeing this merged when it's ready.

            People

              Assignee: Unassigned
              Reporter: Alec Mocatta (mocatta)
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 11h 40m