Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Writing a table that contains an ExtensionType array to a Parquet file is not yet implemented. It currently raises "ArrowNotImplementedError: Unhandled type for Arrow to Parquet schema conversion: extension<arrow.py_extension_type>" (in this case for a PyExtensionType).
I think minimal support could consist of writing the storage type and storage array in place of the extension array.
We might also want to save the extension name and metadata in the Parquet FileMetaData.
Later on, this could potentially be used to restore the extension type when reading. This is related to other issues that need to save the Arrow schema (categorical: ARROW-5480, time zones: ARROW-5888). In this case, however, we probably want to store the serialised type in addition to the schema (which only has the extension type's name).