Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Currently, creating an ExtensionArray from a Python sequence (or NumPy array, etc.) requires the following:
from pyarrow.tests.test_extension_type import IntegerType
storage_array = pa.array([1, 2, 3])
ext_arr = pa.ExtensionArray.from_storage(IntegerType(), storage_array)
While doing this directly in pa.array(..) doesn't work:
>>> pa.array([1, 2, 3], type=IntegerType())
ArrowNotImplementedError: extension
I think it should be possible to basically do the ExtensionArray.from_storage step under the hood in pa.array(..) when the specified type is an extension type?
I think this should also enable converting from a pandas DataFrame (with a column with matching storage values) to a Table with a specified schema that includes an extension type. Like:
df = pd.DataFrame({'a': [1, 2, 3]})
pa.table(df, schema=pa.schema([('a', IntegerType())]))
Issue Links
- is related to ARROW-17813 [Python] Nested ExtensionArray conversion to/from pandas/numpy (Resolved)