Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Given the following table:
    import pyarrow as pa
    import pyarrow.parquet as pq

    data = {"root": [[{"addr": {"this": 3, "that": 3}}]]}
    table = pa.Table.from_pydict(data)
reading the nested column leads to a pyarrow.lib.ArrowInvalid error:
    pq.write_table(table, "/tmp/table.parquet")
    file = pq.ParquetFile("/tmp/table.parquet")
    array = file.read(["root.list.item.addr.that"])
Traceback:
    Traceback (most recent call last):
      File "....", line 21, in <module>
        array = file.read(["root.list.item.addr.that"])
      File "/home/angus/.mambaforge/envs/awkward/lib/python3.9/site-packages/pyarrow/parquet.py", line 383, in read
        return self.reader.read_all(column_indices=column_indices,
      File "pyarrow/_parquet.pyx", line 1097, in pyarrow._parquet.ParquetReader.read_all
      File "pyarrow/error.pxi", line 97, in pyarrow.lib.check_status
    pyarrow.lib.ArrowInvalid: List child array invalid: Invalid: Struct child array #0 does not match type field: struct<that: int64> vs struct<that: int64, this: int64>
It's possible that I don't quite understand this properly - am I doing something wrong?
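For reference, here is a minimal pure-Python sketch of how the dotted column path in the read call is put together. It is only an illustration, not pyarrow code: `leaf_paths` is a hypothetical helper, and it assumes that list levels show up in the path as `list.item`, as they do in the path used in this report.

```python
def leaf_paths(value, prefix=""):
    """Yield dotted leaf paths for a nested Python value.

    Hypothetical helper, not part of pyarrow. Assumes each list level
    contributes the "list.item" segments seen in the path in this report.
    """
    if isinstance(value, dict):
        # Each struct field adds its own name to the path.
        for key, child in value.items():
            yield from leaf_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        # Only the first element is inspected; all elements share one type.
        if value:
            yield from leaf_paths(value[0], f"{prefix}.list.item")
    else:
        # A primitive value is a leaf column.
        yield prefix


# One row of the table in this report: the outer list in `data` holds the rows.
row = {"root": [{"addr": {"this": 3, "that": 3}}]}
paths = list(leaf_paths(row))
print(paths)
```

Under these assumptions the struct `addr` has two leaf columns, and the failing read selects only the `that` leaf of it.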