Apache Arrow / ARROW-17834

[Python] Allow creating ExtensionArray through pa.array(..) constructor

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.0.0
    • Component/s: Python
    • Labels: None

    Description

      Currently, creating an ExtensionArray from a Python sequence (or NumPy array, etc.) requires the following:

      import pyarrow as pa
      from pyarrow.tests.test_extension_type import IntegerType
      
      # Build the storage array first, then wrap it in the extension type
      storage_array = pa.array([1, 2, 3])
      ext_arr = pa.ExtensionArray.from_storage(IntegerType(), storage_array)
      

      Doing this directly in pa.array(..), however, doesn't work:

      >>> pa.array([1, 2, 3], type=IntegerType())
      ArrowNotImplementedError: extension
      

      I think it should be possible to basically do the ExtensionArray.from_storage step under the hood in pa.array(..) when the specified type is an extension type?

      I think this should also enable converting a pandas DataFrame (with a column of matching storage values) to a Table with a specified schema that includes an extension type. For example:

      import pandas as pd
      
      df = pd.DataFrame({'a': [1, 2, 3]})
      pa.table(df, schema=pa.schema([('a', IntegerType())]))
      

            People

              Assignee: Joris Van den Bossche (jorisvandenbossche)
              Reporter: Joris Van den Bossche (jorisvandenbossche)