Apache Arrow / ARROW-17925

[Python] Use ExtensionScalar.as_py() as fallback in ExtensionArray to_pandas?


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: Python

    Description

      This was raised in ARROW-17813 by Chang She:

      ExtensionArray => pandas

      Just for discussion, I was curious whether you had any thoughts around using the extension scalar as a fallback mechanism. It's a lot simpler to define an ExtensionScalar with `as_py` than a pandas extension dtype. So if an ExtensionArray doesn't have an equivalent pandas dtype, would it make sense to convert it to just an object series whose elements are the result of `as_py`?

      and I also mentioned this in ARROW-17535:

      That actually brings up a question: if an ExtensionType defines an ExtensionScalar (but not an associated pandas dtype or a custom to_numpy conversion), should we also use this scalar's as_py() for the to_numpy/to_pandas conversion of plain extension arrays? (not the nested case)

      Because currently, if you have an ExtensionArray like that (for example using the example from the docs: https://arrow.apache.org/docs/dev/python/extending_types.html#custom-scalar-conversion), we still use the storage type conversion for to_numpy/to_pandas, and only use the scalar's conversion in to_pylist.
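
      To make the proposal concrete, here is a minimal sketch of what such a fallback could look like when done by hand. The Point3D type and scalar follow the pattern of the linked docs example; `to_object_series` is a hypothetical helper written for this illustration and is not part of pyarrow. It assumes a pyarrow version that supports `__arrow_ext_scalar_class__`.

      ```python
      from collections import namedtuple

      import pandas as pd
      import pyarrow as pa

      Point3D = namedtuple("Point3D", ["x", "y", "z"])


      class Point3DScalar(pa.ExtensionScalar):
          def as_py(self, **kwargs):
              # **kwargs absorbs optional keyword arguments that some
              # pyarrow versions pass to as_py().
              if not self.is_valid:
                  return None
              return Point3D(*self.value.as_py())


      class Point3DType(pa.ExtensionType):
          def __init__(self):
              # Storage is a fixed-size list of three float32 values.
              super().__init__(pa.list_(pa.float32(), 3), "example.point3d")

          def __arrow_ext_serialize__(self):
              return b""

          @classmethod
          def __arrow_ext_deserialize__(cls, storage_type, serialized):
              return cls()

          def __arrow_ext_scalar_class__(self):
              return Point3DScalar


      def to_object_series(arr):
          # Hypothetical fallback: build an object-dtype Series whose
          # elements are the results of each extension scalar's as_py().
          return pd.Series([scalar.as_py() for scalar in arr], dtype=object)


      storage = pa.array(
          [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], pa.list_(pa.float32(), 3)
      )
      arr = pa.ExtensionArray.from_storage(Point3DType(), storage)
      series = to_object_series(arr)
      ```

      With this helper, the Series elements are the same Point3D objects that `arr.to_pylist()` returns, rather than the values produced by the storage type conversion. Going through Python scalars one by one is likely much slower than the storage conversion, which is one trade-off to weigh before making it the default.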

            People

              Assignee: Unassigned
              Reporter: Joris Van den Bossche
