Apache Arrow / ARROW-1743

[Python] Table to_pandas fails when index contains categorical column

    Description

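      A DataFrame whose index contains a categorical column fails to round-trip through a pyarrow Table: pa.Table.from_pandas succeeds, but the subsequent to_pandas call raises an AttributeError.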

      >>> import pandas as pd
      >>> import pyarrow as pa
      >>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})
      >>> df['a'] = df.a.astype('category')
      >>> df = df.set_index('a')
      >>> tbl = pa.Table.from_pandas(df)
      >>> tbl.to_pandas()
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "table.pxi", line 881, in pyarrow.lib.Table.to_pandas
        File "C:\Users\bpendlet\Miniconda3\envs\panpy3\lib\site-packages\pyarrow\pandas_compat.py", line 303, in table_to_blockmanager
          if not values.flags.writeable:
      AttributeError: 'Categorical' object has no attribute 'flags'
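
The traceback shows that the writeable check assumes a NumPy array, while a categorical index column arrives as a pandas Categorical. A minimal sketch of a type guard around that check follows. The helper name ensure_writeable is hypothetical and this is not necessarily how pyarrow resolved the issue; it only illustrates limiting the flags lookup to objects that are NumPy arrays.

```python
import numpy as np
import pandas as pd


def ensure_writeable(values):
    """Hypothetical helper: return values that are safe to write into."""
    # Only NumPy arrays are checked for the .flags.writeable bit; other
    # containers, such as pandas.Categorical, are passed through untouched.
    if isinstance(values, np.ndarray) and not values.flags.writeable:
        return values.copy()
    return values


read_only = np.arange(3)
read_only.flags.writeable = False
categorical = pd.Categorical([1, 2, 3])

writable = ensure_writeable(read_only)        # fresh, writable copy
passthrough = ensure_writeable(categorical)   # same Categorical object
```

With a guard of this kind, the read-only array is copied and the Categorical is returned unchanged, so the attribute lookup that fails above is never attempted on it.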
      

      It works as expected when the index column is not categorical:

      >>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})
      >>> df = df.set_index('a')
      >>> tbl = pa.Table.from_pandas(df)
      >>> tbl.to_pandas()
         b
      a
      1  1
      2  2
      3  3
      

            People

              Assignee: Licht Takeuchi (Licht-T)
              Reporter: Brian Pendleton (brianpendleton)