Apache Arrow / ARROW-1074

from_pandas doesn't convert ndarray to list

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.4.0
    • Fix Version/s: 0.5.0
    • Component/s: Python
    • Labels:

      Description

      [Feel free to change issue type because this is probably by design]

      I have noticed that if one of the columns in the Parquet file is of type array, the pyarrow table stores it as a list:
      >>> table[3].type
      DataType(list<element: string>)
      If I call .to_pandas() on the column, I get something like this:
      >>> table[3].to_pandas()
      0 None
      1 [7]
      2 [46]
      dtype: object

      However, I can't call pyarrow.Table.from_pandas on a DataFrame that has the above ndarray values as a series/column. I get this error:
      Invalid: Python object of type ndarray is not None and is not a string, bool, float, int, date, decimal object

      If to_pandas() can convert a list to an ndarray, shouldn't from_pandas also convert an ndarray to a list type in the table?

            People

            • Assignee:
              fjetter Florian Jetter
              Reporter:
              abdulrahman004 Abdul Rahman