Details
- Type: Improvement
- Status: Resolved
- Priority: P3
- Resolution: Won't Fix
- Affects Version/s: 2.14.0
- Fix Version/s: None
Description
ParquetIO currently supports neither column projection nor filter predicates, which defeats the performance motivation for using Parquet in the first place. That is why we maintain our own implementation of ParquetIO in Scio.
Reading Parquet as Avro with column projection has some complications: the resulting Avro records may be incomplete and will not survive ser/de. A possible workaround is to provide a TypedRead interface that takes a Function<A, B> mapping the invalid Avro record A into a user-defined type B.
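To make the proposed workaround concrete, here is a minimal, self-contained sketch of the idea. It is not the Beam or Scio API. `ProjectedReadSketch`, `typedRead`, `UserName`, and the sample rows are all hypothetical names invented for illustration, and a `Map<String, Object>` stands in for the partial Avro record A. The point it shows is that the incomplete record is converted to the user-defined type B immediately, so only B would ever need to be serialized.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

public class ProjectedReadSketch {

    /** User-defined type B: holds only the fields the pipeline needs. */
    public static final class UserName {
        public final long id;
        public final String name;

        public UserName(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /**
     * Hypothetical stand-in for a typed read. Each row is reduced to the
     * projected columns (the "incomplete" record A), filtered, and then
     * mapped to B by parseFn before it leaves the reader.
     */
    public static <B> List<B> typedRead(
            List<Map<String, Object>> rows,
            Set<String> projection,
            Predicate<Map<String, Object>> filter,
            Function<Map<String, Object>, B> parseFn) {
        List<B> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            // Column projection: drop every field outside the projection.
            Map<String, Object> partial = new HashMap<>(row);
            partial.keySet().retainAll(projection);
            // Filter predicate, evaluated on the projected columns only.
            if (filter.test(partial)) {
                out.add(parseFn.apply(partial));
            }
        }
        return out;
    }

    /** Builds one in-memory sample row with three columns. */
    private static Map<String, Object> row(long id, String name, String email) {
        Map<String, Object> r = new HashMap<>();
        r.put("id", id);
        r.put("name", name);
        r.put("email", email);
        return r;
    }

    /** Example usage: project (id, name), keep id >= 2, map to UserName. */
    public static List<UserName> readNames() {
        List<Map<String, Object>> rows = Arrays.asList(
                row(1L, "ada", "ada@example.org"),
                row(2L, "bob", "bob@example.org"),
                row(3L, "cy", "cy@example.org"));
        Set<String> projection = new HashSet<>(Arrays.asList("id", "name"));
        return typedRead(
                rows,
                projection,
                partial -> (Long) partial.get("id") >= 2L,
                partial -> new UserName((Long) partial.get("id"), (String) partial.get("name")));
    }

    public static void main(String[] args) {
        for (UserName u : readNames()) {
            System.out.println(u.id + " " + u.name);
        }
    }
}
```

In a real reader the projection and predicate would be pushed down to the Parquet layer rather than applied in memory. The sketch only illustrates where the `Function<A, B>` sits relative to them.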
Attachments
Issue Links
- blocks: BEAM-7929 ParquetTable.buildIOReader should support column projection and filter predicate (Open)
- links to