Apache Arrow / ARROW-10425

[Python] Support reading (compressed) CSV file from remote file / binary blob

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Python

    Description

      From https://stackoverflow.com/questions/64588076/how-can-i-read-a-csv-gz-file-with-pyarrow-from-a-file-object

      Currently pyarrow.csv.read_csv happily takes a path to a compressed file and automatically decompresses it, but AFAIK this only works for local paths.

      It would be nice to support reading CSV from remote files in general (via a URI or by specifying a filesystem), and in that case to support compression as well.

      In addition, we could also read a compressed file from a BytesIO / file-like object, but I'm not sure we want that (as it would require a keyword to indicate the compression used).

          People

            Assignee: Unassigned
            Reporter: Joris Van den Bossche (jorisvandenbossche)
            Votes: 1
            Watchers: 3
