Apache Arrow / ARROW-16580

[Python] Multiindex levels order is not preserved after a from_pandas/to_pandas


Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 8.0.0
    • Fix Version/s: None
    • Component/s: Python

    Description

      Hello,

      I'm not sure this is the right place to report this issue, but here is what I see when I convert a DataFrame with MultiIndex columns to a Table and then convert it back to pandas:

       

      import pandas as pd
      import pyarrow
      
      # pandas version '1.4.1'
      # pyarrow version '8.0.0'
      
      # Build a frame whose columns are a two-level MultiIndex (l1, l2).
      df = pd.DataFrame([[100, 300, 400], [200, 500, 600]], columns=['Toyota', 'Ford', 'Alfa'])
      concatenated = pd.concat([df, df * 2], axis=1, keys=['foo', 'bar'])
      concatenated.columns.names = ['l1', 'l2']
      
      # Round-trip through a pyarrow Table.
      table = pyarrow.Table.from_pandas(concatenated)
      from_table_df = table.to_pandas()
      
      from_table_df.columns.levels  # == FrozenList([['bar', 'foo'], ['Alfa', 'Ford', 'Toyota']])
      concatenated.columns.levels  # == FrozenList([['foo', 'bar'], ['Toyota', 'Ford', 'Alfa']])
      
      
      

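      The order of the column levels is not preserved: after the round trip, each level comes back sorted alphabetically instead of in its original order.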
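
Until this is resolved, one possible workaround is to rebuild the round-tripped MultiIndex against the original level order. The sketch below is only an illustration: `restore_level_order` is a hypothetical helper, not part of pandas or pyarrow, and the two indexes are constructed by hand to mirror the `levels` output reported above, so the snippet runs without pyarrow. It assumes the original levels are still available (for example, saved before calling `Table.from_pandas`).

```python
import pandas as pd


def restore_level_order(index, reference_levels):
    """Rebuild `index` so each level uses the ordering in `reference_levels`.

    Hypothetical helper for illustration; not a pandas or pyarrow API.
    """
    new_codes = []
    for i, ref in enumerate(reference_levels):
        # Position of every label within the reference level (-1 if absent).
        codes = pd.Index(ref).get_indexer(index.get_level_values(i))
        if (codes < 0).any():
            raise ValueError(f"level {i} has labels missing from the reference")
        new_codes.append(codes)
    return pd.MultiIndex(levels=reference_levels, codes=new_codes, names=index.names)


# Original columns: levels in insertion order, as produced by pd.concat(keys=...).
original = pd.MultiIndex(
    levels=[["foo", "bar"], ["Toyota", "Ford", "Alfa"]],
    codes=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]],
    names=["l1", "l2"],
)

# Stand-in for table.to_pandas().columns: same labels, but sorted levels.
round_tripped = pd.MultiIndex(
    levels=[["bar", "foo"], ["Alfa", "Ford", "Toyota"]],
    codes=[[1, 1, 1, 0, 0, 0], [2, 1, 0, 2, 1, 0]],
    names=["l1", "l2"],
)

restored = restore_level_order(round_tripped, original.levels)
```

With real data, the result would be assigned back with something like `from_table_df.columns = restore_level_order(from_table_df.columns, concatenated.columns.levels)`. Only the `levels` metadata changes; the column labels and their order stay the same.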

       

          People

            Assignee: Unassigned
            Reporter: Fabien Aulaire
            Votes: 0
            Watchers: 3
