Apache Arrow / ARROW-4823

[Python] read_csv shouldn't close file handles it doesn't own

    Details

      Description

      If a file handle is passed to `read_csv`, the call closes it automatically:

       

      In [47]: csv = io.BytesIO(b'''issue_date_utc,variable_name,station_name,station_id,value_date_utc,value
          ...: 2019-02-26 22:00:00,TEMPERATURE,ARCHERFIELD,040211,2019-02-27 03:00,29.1
          ...: ''')

      In [48]: pa.csv.read_csv(csv, convert_options=opts)
      Out[48]:
      pyarrow.Table
      issue_date_utc: timestamp[ns]
      variable_name: string
      station_name: string
      station_id: int64
      value_date_utc: string
      value: double

      In [49]: csv.seek(0)
      Traceback (most recent call last):

        File "<ipython-input-50-0644e6e50712>", line 1, in <module>
          csv.seek(0)

      ValueError: I/O operation on closed file.

       

      This behaviour is in contrast to pandas, which leaves the file handle open.

      Since the function didn't create the file handle, I don't think it should close it.
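
The ownership rule requested here can be sketched in plain Python. This is only an illustration of the pattern, not pyarrow's implementation: `read_rows` is a hypothetical helper that parses CSV with the standard library, closing the file only when it opened the file itself.

```python
import csv
import io
import os
from contextlib import nullcontext


def read_rows(source):
    """Hypothetical reader: accepts a filesystem path or an open binary file object."""
    if isinstance(source, (str, os.PathLike)):
        # We open the file here, so we are responsible for closing it.
        ctx = open(source, "rb")
    else:
        # The caller owns this handle; nullcontext leaves it open on exit.
        ctx = nullcontext(source)

    with ctx as f:
        # Read raw bytes and decode them ourselves. Wrapping f in
        # io.TextIOWrapper would close f when the wrapper is closed.
        text = f.read().decode("utf-8")

    return list(csv.reader(io.StringIO(text, newline="")))


buf = io.BytesIO(b"station_name,value\nARCHERFIELD,29.1\n")
rows = read_rows(buf)
buf.seek(0)  # still usable: the handle was not closed by read_rows
```

With this pattern a caller can rewind and reuse the buffer after reading, which is the behaviour the report asks for.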

              People

              Assignee: Wes McKinney (wesm)
              Reporter: Dave Hirschfeld (dhirschfeld)

                  Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m