Apache Arrow / ARROW-377

Python: Add support for conversion of Pandas.Categorical

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: Python

    Description

      At the moment, converting pandas.Categorical columns fails with ArrowException: Invalid: only handle 1-dimensional arrays. Instead of failing, we should provide one of the following solutions:

      • Convert the categorical column to a string (Pandas type object) column, then use the existing conversion routines for strings. Add metadata to the Arrow column recording that it was originally a Pandas categorical column, so that a roundtrip yields a categorical column again.
      • Convert the column to a dictionary-encoded Arrow column. This is the preferred solution, but it may be harder to implement because some of its prerequisites have not yet been implemented.
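
The two options above can be sketched in plain Python to show what each one stores. This is only an illustration under stated assumptions: it does not use the pandas or Arrow APIs, and the helper names (`to_string_column`, `dictionary_encode`, `dictionary_decode`) and the `pandas_type` metadata key are hypothetical, not taken from the issue or from any library.

```python
from typing import Any, Dict, List, Optional, Tuple


def to_string_column(values: List[Optional[str]]) -> Dict[str, Any]:
    """Option 1 (hypothetical layout): plain strings plus a metadata flag.

    The flag lets a reader restore the categorical type on a roundtrip.
    """
    return {
        "values": [None if v is None else str(v) for v in values],
        "metadata": {"pandas_type": "categorical"},  # assumed key name
    }


def dictionary_encode(
    values: List[Optional[str]],
) -> Tuple[List[Optional[int]], List[str]]:
    """Option 2: store each distinct value once and refer to it by index.

    Nulls are kept as None in the index list and never enter the dictionary.
    """
    dictionary: List[str] = []
    positions: Dict[str, int] = {}
    indices: List[Optional[int]] = []
    for value in values:
        if value is None:
            indices.append(None)
            continue
        if value not in positions:
            # First occurrence: assign the next free dictionary slot.
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


def dictionary_decode(
    indices: List[Optional[int]], dictionary: List[str]
) -> List[Optional[str]]:
    """Rebuild the original values from indices and dictionary."""
    return [None if i is None else dictionary[i] for i in indices]


column = ["a", "b", "a", None, "c", "b"]
string_column = to_string_column(column)
indices, dictionary = dictionary_encode(column)
roundtrip = dictionary_decode(indices, dictionary)
```

The dictionary-encoded form mirrors how a categorical column is already laid out (integer codes plus a list of categories), which is one reason the second option is a natural fit, while the first option needs extra metadata to recover the original type.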

          People

            Assignee: Wes McKinney (wesm)
            Reporter: Uwe Korn (uwe)