Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Write a tiny Parquet file and read its metadata back to obtain a FileMetaData object:
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.table({'a': [1, 2, 3], 'b': [4, 5, 6]})
    pq.write_table(table, "test_file_for_metadata.parquet")
    metadata = pq.read_metadata("test_file_for_metadata.parquet")
    metadata.append_row_groups(metadata)
The last line, which calls AppendRowGroups through append_row_groups (appending the metadata object to itself), never returns and uses ever more memory (I killed the process when it had reached 10 GB).
This is not a useful thing to do, but I still wouldn't expect it to blow up, since one can do it by accident (I was actually trying it in an attempt to create a large FileMetaData object).