Apache Arrow / ARROW-1291

[Python] pa.RecordBatch.from_pandas doesn't accept DataFrame with numeric column names

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: Python
    • Labels: None

    Description

      import pyarrow as pa
      import pandas as pd
      
      # No column names are given, so pandas assigns the integer label 0
      df = pd.DataFrame([1])
      pa.RecordBatch.from_pandas(df)
      

      Exception:

      TypeError                                 Traceback (most recent call last)
      <ipython-input-5-670ba4a2ddb2> in <module>()
            3 
            4 df = pd.DataFrame([1])
      ----> 5 pa.RecordBatch.from_pandas(df)
      
      table.pxi in pyarrow.lib.RecordBatch.from_pandas()
      
      table.pxi in pyarrow.lib._dataframe_to_arrays()
      
      /home/icexelloss/miniconda3/envs/spark-dev/lib/python3.5/site-packages/pyarrow/pandas_compat.py in construct_metadata(df, index_levels, preserve_index, types)
          187                         arrow_type=arrow_type
          188                     )
      --> 189                     for name, arrow_type in zip(df.columns, df_types)
          190                 ] + (
          191                     [
      
      /home/icexelloss/miniconda3/envs/spark-dev/lib/python3.5/site-packages/pyarrow/pandas_compat.py in <listcomp>(.0)
          187                         arrow_type=arrow_type
          188                     )
      --> 189                     for name, arrow_type in zip(df.columns, df_types)
          190                 ] + (
          191                     [
      
      /home/icexelloss/miniconda3/envs/spark-dev/lib/python3.5/site-packages/pyarrow/pandas_compat.py in get_column_metadata(column, name, arrow_type)
          125         raise TypeError(
          126             'Column name must be a string. Got column {} of type {}'.format(
      --> 127                 name, type(name).__name__
          128             )
          129         )
      
      TypeError: Column name must be a string. Got column 0 of type int64
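
      Possible workaround (a sketch, not part of the original report): since the check in get_column_metadata rejects any column name that is not a string, converting the labels to str before calling from_pandas should avoid the TypeError on affected versions. The helper name stringify_column_names is hypothetical and introduced only for illustration. Note that distinct labels such as 1 and '1' would collide after conversion.

```python
def stringify_column_names(columns):
    """Return the column labels as a list of str; str labels are left unchanged.

    Hypothetical helper, not part of pyarrow or pandas.
    """
    return [name if isinstance(name, str) else str(name) for name in columns]


# Intended use with the reproduction above (requires pandas and pyarrow):
#   df = pd.DataFrame([1])
#   df.columns = stringify_column_names(df.columns)  # label 0 becomes '0'
#   pa.RecordBatch.from_pandas(df)
```

      The conversion changes the column labels of the DataFrame itself, so any code that later selects columns by integer label would need to use the string form instead.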
      

          People

            Assignee: Wes McKinney (wesm)
            Reporter: Li Jin (icexelloss)
            Votes: 0
            Watchers: 3
