Details
Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Description
Currently, org.apache.parquet.hadoop.ParquetFileWriter#appendFile(org.apache.parquet.io.InputFile) uses the appendRowGroup method to concatenate Parquet row groups. However, appendRowGroup loses the column index and offset index: it records null for both instead of copying them from the source file.
public void appendRowGroup(SeekableInputStream from, BlockMetaData rowGroup, boolean dropColumns) throws IOException {
  ....
  // TODO: column/offset indexes are not copied
  // (it would require seeking to the end of the file for each row groups)
  currentColumnIndexes.add(null);
  currentOffsetIndexes.add(null);
}
It would be useful to have append functionality that preserves the page index (column and offset indexes) of the appended row groups.
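To make the difference concrete, here is a minimal toy model of the current and the requested behaviour. It is only a sketch under stated assumptions: the RowGroup and Writer types, the method names appendDroppingIndexes and appendKeepingIndexes, and the use of strings as index entries are all hypothetical stand-ins and not parquet-mr classes or APIs. Only the field name currentColumnIndexes is borrowed from the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every type here is a hypothetical stand-in, not a parquet-mr class.
class AppendIndexSketch {

    /** Stand-in for a source row group: one index entry per column chunk. */
    static final class RowGroup {
        final List<String> columnIndexes;

        RowGroup(List<String> columnIndexes) {
            this.columnIndexes = columnIndexes;
        }
    }

    /** Stand-in for the writer state that would later be flushed to the file footer. */
    static final class Writer {
        final List<String> currentColumnIndexes = new ArrayList<>();

        /** Models the behaviour in the description: a null placeholder per column chunk. */
        void appendDroppingIndexes(RowGroup rowGroup) {
            for (int i = 0; i < rowGroup.columnIndexes.size(); i++) {
                currentColumnIndexes.add(null);
            }
        }

        /** Models the requested behaviour: carry the source index entries along. */
        void appendKeepingIndexes(RowGroup rowGroup) {
            currentColumnIndexes.addAll(rowGroup.columnIndexes);
        }
    }

    public static void main(String[] args) {
        RowGroup source = new RowGroup(Arrays.asList("min=1,max=9", "min=a,max=z"));

        Writer lossy = new Writer();
        lossy.appendDroppingIndexes(source);
        System.out.println("dropping: " + lossy.currentColumnIndexes);

        Writer preserving = new Writer();
        preserving.appendKeepingIndexes(source);
        System.out.println("keeping:  " + preserving.currentColumnIndexes);
    }
}
```

In the real writer the index data would presumably have to be read from the source file before it can be copied, which is the extra seek that the TODO comment mentions; the toy model leaves that cost out.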