Details
Improvement
Status: Closed
Minor
Resolution: Invalid
1.0.0
None
None
Linux
Description
I would like to write a dataframe to a Parquet file.
The problem is that the output dataframe shows up as
```
0 {'field0': 5, 'field1': 8}
1 {'field0': 5, 'field1': 8}
2 {'field0': 4, 'field1': 7}
```
while what I want is
```
0 {'A': 5, 'B': 8}
1 {'A': 5, 'B': 8}
2 {'A': 4, 'B': 7}
```
As I understand it, the discrepancy is because I did not pass the metadata properly when creating the table. That is, I did:
schema_metadata = ::arrow::key_value_metadata("pandas", metadata.data());
schema = std::make_shared<arrow::Schema>(schema_vector, schema_metadata);
arrow_table = arrow::Table::Make(schema, columns, row_group_size);
status = parquet::arrow::WriteTable( *arrow_table, pool, out_stream, row_group_size, writer_properties, ...)
The problem is that I could not find any documentation on how the metadata should be built. Adding such documentation would be very helpful.
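For reference, here is a minimal sketch of how the JSON value for the "pandas" key might be assembled by hand, assuming the layout described in the pandas developer documentation (top-level keys `index_columns`, `column_indexes`, `columns`, `creator`, `pandas_version`). The `PandasColumn` struct and the `JsonQuote` and `BuildPandasMetadata` helpers are hypothetical names invented for this illustration, and the creator and version strings are placeholders, not values that any library requires.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type: one entry of the "columns" list.
struct PandasColumn {
  std::string name;         // column name as pandas should see it
  std::string pandas_type;  // e.g. "int64" or "object"
  std::string numpy_type;   // e.g. "int64" or "object"
};

// Quote a string as a JSON string literal. Only '"' and '\\' are escaped,
// which is enough for plain ASCII column names.
std::string JsonQuote(const std::string& s) {
  std::string out = "\"";
  for (char c : s) {
    if (c == '"' || c == '\\') out += '\\';
    out += c;
  }
  out += '"';
  return out;
}

// Assemble the JSON text for the "pandas" metadata key.
// Assumptions: no index columns are stored, "column_indexes" is left empty,
// and each field_name equals the column name.
std::string BuildPandasMetadata(const std::vector<PandasColumn>& columns) {
  std::string json =
      "{\"index_columns\": [], \"column_indexes\": [], \"columns\": [";
  for (std::size_t i = 0; i < columns.size(); ++i) {
    const PandasColumn& c = columns[i];
    if (i > 0) json += ", ";
    json += "{\"name\": " + JsonQuote(c.name) +
            ", \"field_name\": " + JsonQuote(c.name) +
            ", \"pandas_type\": " + JsonQuote(c.pandas_type) +
            ", \"numpy_type\": " + JsonQuote(c.numpy_type) +
            ", \"metadata\": null}";
  }
  // Placeholder creator and version values.
  json += "], \"creator\": {\"library\": \"example\", \"version\": \"0.0.0\"}"
          ", \"pandas_version\": \"1.0.0\"}";
  return json;
}
```

The resulting string could then be attached with something like `arrow::key_value_metadata({"pandas"}, {json})`, which takes a vector of keys and a vector of values. Note that this metadata describes top-level columns only. The keys inside each struct value (`field0`/`field1` versus `A`/`B`) come from the field names of the Arrow struct type itself, so they would need to be set where the struct type is built, for example with `arrow::struct_({arrow::field("A", arrow::int64()), arrow::field("B", arrow::int64())})`.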