Apache Arrow / ARROW-5952

[Python] Segfault when reading empty table with category as pandas dataframe


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0, 0.14.1
    • Fix Version/s: 0.15.0
    • Component/s: Python
    • Environment: Linux 3.10.0-327.36.3.el7.x86_64
      Python 3.6.8
      Pandas 0.24.2
      Pyarrow 0.14.0

    Description

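      The following two short programs reproduce the issue. The first writes an empty table with a categorical column to an Arrow file named 'bar':

      import pandas as pd
      import pyarrow as pa

      # An empty DataFrame whose only column has the category dtype
      empty = pd.DataFrame({'foo': []}, dtype='category')
      table = pa.Table.from_pandas(empty)

      # Write the table to 'bar' in the Arrow file format
      outfile = pa.output_stream('bar')
      writer = pa.RecordBatchFileWriter(outfile, table.schema)
      writer.write(table)
      writer.close()

      The second reads the file back as a pandas DataFrame and crashes:

      import pyarrow as pa

      # Reading the file back as a DataFrame triggers the segfault
      pa.ipc.open_file('bar').read_pandas()

      Output:

      Segmentation fault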
      

      My apologies if this was already reported elsewhere; I searched but could not find an issue that seemed to describe the same behavior.
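Because a segfault terminates the interpreter that triggers it, it can be convenient to run each of the two programs above in a child process and inspect the exit code from a parent that never imports pyarrow itself. The following is only a sketch under that assumption and is not part of the original report: `run_isolated`, `describe`, `WRITE_SOURCE`, and `READ_SOURCE` are hypothetical names introduced here, and the interpretation of negative exit codes as signals applies to POSIX systems.

```python
import subprocess
import sys

# First program from the report: write an empty categorical table to 'bar'.
WRITE_SOURCE = """
import pandas as pd
import pyarrow as pa
empty = pd.DataFrame({'foo': []}, dtype='category')
table = pa.Table.from_pandas(empty)
outfile = pa.output_stream('bar')
writer = pa.RecordBatchFileWriter(outfile, table.schema)
writer.write(table)
writer.close()
"""

# Second program from the report: read 'bar' back as a pandas DataFrame.
READ_SOURCE = """
import pyarrow as pa
pa.ipc.open_file('bar').read_pandas()
"""


def run_isolated(source):
    """Run Python source in a fresh interpreter and return its exit code."""
    return subprocess.run([sys.executable, "-c", source]).returncode


def describe(returncode):
    """Turn a child exit code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N
        # (SIGSEGV is signal 11 on Linux).
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


if __name__ == "__main__":
    # Each step runs in its own interpreter, mirroring the two separate
    # programs in the report; a crash in a child cannot take down the parent.
    print("write:", describe(run_isolated(WRITE_SOURCE)))
    print("read: ", describe(run_isolated(READ_SOURCE)))
```

On an affected version the read step would be expected to report a signal rather than a Python traceback, while on a version containing the fix both steps should report ok. The same pattern could serve as a regression check, since the parent process survives either outcome.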

            People

              Assignee: Joris Van den Bossche (jorisvandenbossche)
              Reporter: Daniel Nugent (nugend)
                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h