Apache Arrow / ARROW-17313

[C++] Add Byte Range to CSV Reader ReadOptions

Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: C++, Python

    Description

      Sometimes it's desirable to read only a portion of a CSV file. The best way to do that is to pass a list of byte ranges in the CSV read options, specifying which parts of the file to read. These byte ranges don't have to be aligned on line-break boundaries: the CSV reader should skip anything before the first line break in a byte range and keep reading until the end of the line that the range ends in.
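The behaviour described above can be sketched in plain Python. This is only an illustration of the proposed semantics, not Arrow's implementation: the function name `lines_in_byte_range` is hypothetical, and two details are assumptions added here so that adjacent ranges cover every line exactly once (a range starting at byte 0 skips nothing, and a line belongs to a range when its first byte is at or before the range end). Quoted fields containing newlines are ignored.

```python
def lines_in_byte_range(data: bytes, start: int, end: int) -> list[bytes]:
    """Return the raw CSV lines owned by the byte range [start, end).

    Assumptions (not from the issue): a range starting at 0 skips nothing,
    and a line is owned by the range if its first byte is <= end.
    """
    pos = start
    if start > 0:
        # Skip the partial (or whole) line that precedes the first line break.
        newline = data.find(b"\n", start)
        if newline == -1:
            return []  # no line break at or after start: nothing to read
        pos = newline + 1

    lines = []
    while pos < len(data) and pos <= end:
        newline = data.find(b"\n", pos)
        # Read to the end of the line, even if that runs past `end`.
        stop = len(data) if newline == -1 else newline + 1
        lines.append(data[pos:stop])
        pos = stop
    return lines


data = b"a,b\n1,2\n3,4\n5,6\n"
first = lines_in_byte_range(data, 0, 6)           # range ends mid-line
second = lines_in_byte_range(data, 6, len(data))  # range starts mid-line
```

With these rules, `first` holds the header and the `1,2` row, and `second` holds the remaining two rows, so the two unaligned ranges together yield each line once.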

      Based on discussion, the scope here is being reduced. The first implementation will support a single byte range that is assumed to be already aligned on line boundaries.

      The first implementation will not handle quotes, returns, or other edge cases.
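For the reduced scope, a minimal sketch of reading one pre-aligned byte range might look like the following. The helper name `read_aligned_range` and the alignment checks are hypothetical and are not part of Arrow's API; the sketch only verifies that the range starts right after a line break and ends on one (or at end of file), and, like the reduced scope, it ignores quoting.

```python
import io
from typing import BinaryIO


def read_aligned_range(f: BinaryIO, start: int, end: int) -> bytes:
    """Read bytes [start, end) from a seekable file, checking line alignment."""
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    if start > 0:
        # The byte just before the range must be a line break.
        f.seek(start - 1)
        if f.read(1) != b"\n":
            raise ValueError("start is not at the beginning of a line")
    f.seek(start)
    chunk = f.read(end - start)
    # The range must end on a line break, unless it runs to end of file.
    f.seek(end)
    at_eof = f.read(1) == b""
    if chunk and not chunk.endswith(b"\n") and not at_eof:
        raise ValueError("end is not at the end of a line")
    return chunk


source = io.BytesIO(b"a,b\n1,2\n3,4\n5,6\n")
rows = read_aligned_range(source, 4, 12)  # the two middle rows, no header
```

Because such a slice usually does not contain the header row, the resulting bytes could then be parsed with `pyarrow.csv.read_csv` on a `pyarrow.BufferReader`, passing the column names explicitly through `ReadOptions(column_names=...)`.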

          People

            Assignee: Unassigned
            Reporter: Ziheng Wang (marsupialtail)
            Votes: 0
            Watchers: 7

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 5h 40m