Apache Arrow / ARROW-1998

[Python] Table.from_pandas crashes when data frame is empty

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.9.0
    • Component/s: Python
    • Environment: Windows 10 Build 15063.850
      Python: 3.6.3
      NumPy: 1.14.0
      pandas: 0.22.0

    Description

      Loading an empty CSV file into a pandas DataFrame and then creating a PyArrow Table from that DataFrame crashes the application. The following code should reproduce the issue:

      import numpy as np
      import pandas as pd
      import pyarrow as pa
      
      FIELDS = ['id', 'name']
      NUMPY_TYPES = {
          'id': np.int64,
          # np.unicode was only an alias for str on Python 3
          'name': str
      }
      PYARROW_SCHEMA = pa.schema([
          pa.field('id', pa.int64()),
          pa.field('name', pa.string())
      ])
      
      # Create an empty input file
      with open('input.csv', 'w'):
          pass
      
      # Reading the empty file yields a DataFrame with two columns and no rows
      df = pd.read_csv(
          'input.csv',
          header=None,
          names=FIELDS,
          dtype=NUMPY_TYPES,
          engine='c',
      )
      
      # This call crashes on the affected version
      pa.Table.from_pandas(df, schema=PYARROW_SCHEMA)
      

            People

              Assignee: Antoine Pitrou (apitrou)
              Reporter: Victor Jimenez (betabandido)