Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Hello everyone,
I am transferring this from GitHub for pyarrow. While working with pyarrow, I noticed that field metadata does not get carried forward when creating a table out of several columns. Is this intended behaviour, or is there a way to add column metadata later on? The last command in my example does not return anything.
I also could not verify whether this metadata would be written to Parquet later on, because I could not find a way to add field metadata directly to a table.
>>> import pyarrow as pa
>>> import pyarrow.parquet as pq
>>> arr1 = pa.array([1,2])
>>> arr2 = pa.array([3,4])
>>> field1 = pa.field('field1', pa.int64())
>>> field2 = pa.field('field2', pa.int64())
>>> field1 = field1.add_metadata({'foo1': 'bar1'})
>>> field2 = field2.add_metadata({'foo2': 'bar2'})
>>> field1.metadata
{b'foo1': b'bar1'}
>>> field2.metadata
{b'foo2': b'bar2'}
>>> col1 = pa.column(field1, arr1)
>>> col2 = pa.column(field2, arr2)
>>> col1.field.metadata
{b'foo1': b'bar1'}
>>> tab = pa.Table.from_arrays([col1, col2])
>>> tab
pyarrow.Table
field1: int64
field2: int64
>>> tab.column(0).field.metadata