  1. Apache Arrow
  2. ARROW-5271

[Python] Interface for converting pandas ExtensionArray / other custom array objects to pyarrow Array

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Python
    • Labels:
      None

      Description

      Related to ARROW-2428, which covers the reverse direction: converting back to an ExtensionArray in to_pandas.

      To start supporting the conversion of custom ExtensionArrays (e.g. the nullable Int64Dtype in pandas, or the arrow-backed fletcher arrays, ...) to arrow Arrays (e.g. in pyarrow.array(..)), I think it would be good to define an interface or hook that external projects can implement and that pyarrow will call if available.
      This would allow external projects to define how their arrays are converted to arrow arrays, without pyarrow itself having to accumulate a lot of special-cased code for certain types (like pandas' nullable Int64).

      This could be similar to how numpy looks for the __array__ method, so we might call it __arrow_array__.
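
      To make the proposal concrete, here is a minimal sketch of the dispatch idea in plain Python. It does not use pyarrow: FakeArrowArray, to_arrow and NullableIntArray are hypothetical stand-ins invented for this illustration, and the type_name parameter is an assumption about what such a hook might accept. Only the __arrow_array__ name comes from the proposal above.

```python
from typing import Any, List, Optional


class FakeArrowArray:
    """Hypothetical stand-in for a pyarrow Array so the sketch runs without pyarrow."""

    def __init__(self, values: List[Optional[int]], type_name: str) -> None:
        self.values = values
        self.type_name = type_name


def to_arrow(obj: Any, type_name: Optional[str] = None) -> FakeArrowArray:
    """Hypothetical converter playing the role of pyarrow.array(..)."""
    # Look the hook up on the type, similar to how numpy looks for __array__.
    hook = getattr(type(obj), "__arrow_array__", None)
    if hook is not None:
        result = hook(obj, type_name)
        if not isinstance(result, FakeArrowArray):
            raise TypeError("__arrow_array__ must return an arrow array")
        return result
    # No hook defined: fall back to a generic conversion of the elements.
    return FakeArrowArray(list(obj), type_name or "int64")


class NullableIntArray:
    """Toy external array: integer data plus a mask (True means missing)."""

    def __init__(self, data: List[int], mask: List[bool]) -> None:
        self.data = list(data)
        self.mask = list(mask)

    def __arrow_array__(self, type_name: Optional[str] = None) -> FakeArrowArray:
        # The external project decides how its masked values map to nulls.
        values = [None if m else v for v, m in zip(self.data, self.mask)]
        return FakeArrowArray(values, type_name or "int64")


converted = to_arrow(NullableIntArray([1, 2, 3], [False, True, False]))
plain = to_arrow([4, 5])
```

      The point of the design is that the converter only needs to know the hook's name. All knowledge about masks, missing values and storage layout stays in the external project.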

      See also https://github.com/pandas-dev/pandas/issues/20612 for an issue discussing this on the pandas side.

              People

              • Assignee:
                Unassigned
                Reporter:
                Joris Van den Bossche (jorisvandenbossche)
              • Votes:
                1
                Watchers:
                5
