  1. Apache Arrow
  2. ARROW-1732

[Python] RecordBatch.from_pandas fails on DataFrame with no columns when preserve_index=False

Details

    Description

      I believe this should have well-defined behavior and not raise an error:

      In [5]: pa.RecordBatch.from_pandas(pd.DataFrame({}), preserve_index=False)
      ---------------------------------------------------------------------------
      ValueError                                Traceback (most recent call last)
      <ipython-input-5-4dda72b47dbd> in <module>()
      ----> 1 pa.RecordBatch.from_pandas(pd.DataFrame({}), preserve_index=False)
      
      ~/code/arrow/python/pyarrow/table.pxi in pyarrow.lib.RecordBatch.from_pandas (/home/wesm/code/arrow/python/build/temp.linux-x86_64-3.5/lib.cxx:39957)()
          586             df, schema, preserve_index, nthreads=nthreads
          587         )
      --> 588         return cls.from_arrays(arrays, names, metadata)
          589 
          590     @staticmethod
      
      ~/code/arrow/python/pyarrow/table.pxi in pyarrow.lib.RecordBatch.from_arrays (/home/wesm/code/arrow/python/build/temp.linux-x86_64-3.5/lib.cxx:40130)()
          615 
          616         if not number_of_arrays:
      --> 617             raise ValueError('Record batch cannot contain no arrays (for now)')
          618 
          619         num_rows = len(arrays[0])
      
      ValueError: Record batch cannot contain no arrays (for now)
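
To make the requested behavior concrete, here is a minimal pure-Python sketch. It is not pyarrow's code: `from_arrays_strict` and `from_arrays_lenient` are hypothetical names that only model the check shown in the traceback and one possible well-defined alternative, where zero arrays produce a batch with zero columns and zero rows. The `(names, num_rows)` return value is an assumption made for illustration.

```python
def from_arrays_strict(arrays, names):
    """Model of the check in the traceback: an empty list of arrays is rejected."""
    if not len(arrays):
        raise ValueError('Record batch cannot contain no arrays (for now)')
    # The row count is taken from the first array, so one must exist.
    num_rows = len(arrays[0])
    return list(names), num_rows


def from_arrays_lenient(arrays, names):
    """Hypothetical alternative: no arrays means zero columns and zero rows."""
    # Fall back to 0 rows instead of indexing into an empty list.
    num_rows = len(arrays[0]) if arrays else 0
    if any(len(a) != num_rows for a in arrays):
        raise ValueError('All arrays must have the same length')
    return list(names), num_rows
```

With the lenient variant, the empty input from the report would yield an empty batch description instead of an error, while non-empty inputs behave the same in both versions.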
      

            People

              wesm Wes McKinney