Apache Arrow / ARROW-3801

[Python] Pandas-Arrow roundtrip makes pd categorical index not writeable


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.14.0
    • Component/s: C++, Python
    • Labels: None

    Description

      Serializing and deserializing a pandas Series with a categorical dtype through Arrow makes the categories index non-writeable. This in turn trips up pandas when, for example, reordering the categories, raising "ValueError: buffer source array is read-only":

      import pandas as pd
      import pyarrow as pa
      
      df = pd.Series([1,2,3], dtype='category', name="c1").to_frame()
      print("DType before:", repr(df.c1.dtype))
      print("Writeable:", df.c1.cat.categories.values.flags.writeable)
      ro = df.c1.cat.reorder_categories([3,2,1])
      print("DType reordered:", repr(ro.dtype), "\n")
      
      tbl = pa.Table.from_pandas(df)
      df2 = tbl.to_pandas()
      print("DType after:", repr(df2.c1.dtype))
      print("Writeable:", df2.c1.cat.categories.values.flags.writeable)
      ro = df2.c1.cat.reorder_categories([3,2,1])
      print("DType reordered:", repr(ro.dtype), "\n")
      

       

      Outputs:

       

      DType before: CategoricalDtype(categories=[1, 2, 3], ordered=False)
      Writeable: True
      DType reordered: CategoricalDtype(categories=[3, 2, 1], ordered=False)
      DType after: CategoricalDtype(categories=[1, 2, 3], ordered=False)
      Writeable: False
      ---------------------------------------------------------------------------
      ValueError Traceback (most recent call last)
      <ipython-input-365-85b439586c1a> in <module>
       12 print("DType after:", repr(df2.c1.dtype))
       13 print("Writeable:", df2.c1.cat.categories.values.flags.writeable)
      ---> 14 ro = df2.c1.cat.reorder_categories([3,2,1])
       15 print("DType reordered:", repr(ro.dtype), "\n")
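
A possible workaround, sketched under assumptions rather than taken from the report: rebuild the categorical from its codes and a fresh copy of the categories, so that the categories no longer share the read-only buffer. The helper name `with_writeable_categories` is hypothetical, and the read-only state is simulated here with NumPy's `setflags` instead of an actual Arrow roundtrip. Whether the copy is needed at all depends on the pandas version in use.

```python
import numpy as np
import pandas as pd


def with_writeable_categories(s: pd.Series) -> pd.Series:
    """Return a categorical Series rebuilt on a fresh copy of its categories.

    Hypothetical helper, not part of pandas or pyarrow.
    """
    cat = s.array  # the underlying pd.Categorical
    # np.array(..., copy=True) always allocates a new, writeable array.
    fresh = np.array(cat.categories.values, copy=True)
    rebuilt = pd.Categorical.from_codes(
        cat.codes, categories=pd.Index(fresh), ordered=cat.ordered
    )
    return pd.Series(rebuilt, index=s.index, name=s.name)


# Simulate the read-only categories buffer described in the report.
cats = np.array([1, 2, 3])
cats.setflags(write=False)
s = pd.Series(
    pd.Categorical.from_codes([0, 1, 2], categories=pd.Index(cats)), name="c1"
)

fixed = with_writeable_categories(s)
reordered = fixed.cat.reorder_categories([3, 2, 1])
print("DType reordered:", repr(reordered.dtype))
```

In the reproduction above, the same helper could be applied as `df2["c1"] = with_writeable_categories(df2["c1"])` before calling `reorder_categories`. The copy costs one extra allocation for the categories only; the codes are reused.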
      

       

       
       

          People

            Assignee: Unassigned
            Reporter: Thomas Buhrmann (buhrmann)