Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
>>> import io
>>> import pyarrow as pa
>>> import pyarrow.parquet as pq
>>> list_arr = pa.array([[1, 2], [3, 4, 5]])
>>> int_arr = pa.array([10, 11])
>>> table = pa.Table.from_arrays([int_arr, list_arr], ['ints', 'lists'])
>>> bio = io.BytesIO()
>>> pq.write_table(table, bio)
>>> bio.seek(0)
0
>>> reader = pq.ParquetReader()
>>> reader.open(bio)
>>> reader.scan_contents()
Traceback (most recent call last):
  File "<ipython-input-23-58e977f6d60b>", line 1, in <module>
    reader.scan_contents()
  File "_parquet.pyx", line 753, in pyarrow._parquet.ParquetReader.scan_contents
  File "error.pxi", line 79, in pyarrow.lib.check_status
ArrowIOError: Parquet error: Total rows among columns do not match
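ScanFileContents() claims to return the "number of semantic rows", but it appears to count physical elements instead. In the example above, both columns have 2 rows, yet the 'lists' column holds 5 leaf values against 2 in the 'ints' column, which would explain why the per-column totals do not match.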
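To make the distinction concrete, here is a minimal pure-Python sketch of the two ways of counting, applied to the same data as the reproduction above. The helper names `semantic_rows` and `physical_elements` are hypothetical and exist only for illustration; this is not how ScanFileContents() is implemented, and it ignores nulls and empty lists, which Parquet still encodes as level entries.

```python
def semantic_rows(column):
    """Count top-level rows: one per record, regardless of nesting."""
    return len(column)


def physical_elements(column):
    """Count leaf values, descending into nested lists.

    Simplification: nulls and empty lists are not modelled here.
    """
    total = 0
    for value in column:
        if isinstance(value, list):
            total += physical_elements(value)
        else:
            total += 1
    return total


# Same data as the table in the bug report.
columns = {
    "ints": [10, 11],
    "lists": [[1, 2], [3, 4, 5]],
}

row_counts = {name: semantic_rows(col) for name, col in columns.items()}
element_counts = {name: physical_elements(col) for name, col in columns.items()}

print(row_counts)      # {'ints': 2, 'lists': 2}
print(element_counts)  # {'ints': 2, 'lists': 5}
```

Counting rows gives the same total for every column, while counting leaf values does not once a column is nested. A consistency check across columns would therefore only pass on flat tables if it compares leaf-value counts.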