Apache Arrow / ARROW-1732

[Python] RecordBatch.from_pandas fails on DataFrame with no columns when preserve_index=False

      Description

      I believe this should have well-defined behavior and not raise an error:

      In [5]: pa.RecordBatch.from_pandas(pd.DataFrame({}), preserve_index=False)
      ---------------------------------------------------------------------------
      ValueError                                Traceback (most recent call last)
      <ipython-input-5-4dda72b47dbd> in <module>()
      ----> 1 pa.RecordBatch.from_pandas(pd.DataFrame({}), preserve_index=False)
      
      ~/code/arrow/python/pyarrow/table.pxi in pyarrow.lib.RecordBatch.from_pandas (/home/wesm/code/arrow/python/build/temp.linux-x86_64-3.5/lib.cxx:39957)()
          586             df, schema, preserve_index, nthreads=nthreads
          587         )
      --> 588         return cls.from_arrays(arrays, names, metadata)
          589 
          590     @staticmethod
      
      ~/code/arrow/python/pyarrow/table.pxi in pyarrow.lib.RecordBatch.from_arrays (/home/wesm/code/arrow/python/build/temp.linux-x86_64-3.5/lib.cxx:40130)()
          615 
          616         if not number_of_arrays:
      --> 617             raise ValueError('Record batch cannot contain no arrays (for now)')
          618 
          619         num_rows = len(arrays[0])
      
      ValueError: Record batch cannot contain no arrays (for now)
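
The traceback shows that `from_arrays` takes the row count from `arrays[0]`, which does not exist when there are no columns. Below is a minimal sketch of one possible well-defined behavior, assuming an empty list of arrays should mean zero columns and zero rows. The helper `infer_num_rows` is hypothetical and is not part of pyarrow. This is not necessarily how the issue was resolved.

```python
def infer_num_rows(arrays):
    """Return the common length of the column arrays.

    Hypothetical helper, not part of pyarrow: an empty list of arrays is
    treated as a batch with zero columns and zero rows instead of an error.
    """
    if not arrays:
        # No columns: there is no arrays[0] to measure, so report zero rows.
        return 0

    num_rows = len(arrays[0])
    for i, arr in enumerate(arrays):
        # All columns of a record batch must have the same length.
        if len(arr) != num_rows:
            raise ValueError(
                "Column {} has length {}, expected {}".format(i, len(arr), num_rows)
            )
    return num_rows
```

Under this assumption, `infer_num_rows([])` returns 0 rather than raising. One caveat: a DataFrame with no columns but a non-empty index would lose its row count when `preserve_index=False`, so a real fix may need to carry the row count separately.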
      

              People

              • Assignee:
                wesmckinn Wes McKinney
                Reporter:
                wesmckinn Wes McKinney