Description
Parquet (http://parquet.io/) is a columnar storage format developed by Twitter and Cloudera. Implement Parquet support for Tajo.
The implementation consists of the following:
- ParquetScanner and ParquetAppender - FileScanner and FileAppender implementations for reading and writing Parquet files.
- TajoParquetReader and TajoParquetWriter - Top-level reader and writer that deserialize Parquet records into Tajo Tuples and serialize Tajo Tuples into Parquet records.
- TajoReadSupport and TajoWriteSupport - Abstractions to perform conversion between Parquet and Tajo records.
- TajoRecordMaterializer - Materializes Tajo Tuples from Parquet's internal representation.
- TajoRecordConverter - Used by TajoRecordMaterializer to materialize a Tajo Tuple.
- TajoSchemaConverter - Converts between Tajo and Parquet schemas.
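
To make the schema conversion step more concrete, here is a minimal, self-contained sketch of the Tajo-to-Parquet direction. It does not use the real Tajo or Parquet classes: SchemaSketch, TajoType, Column, and the type mapping are hypothetical stand-ins chosen for illustration, and the actual TajoSchemaConverter may map types differently. The output uses Parquet's textual message-type syntax.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not the actual TajoSchemaConverter.
class SchemaSketch {
    // Hypothetical subset of Tajo column types.
    enum TajoType { BOOLEAN, INT4, INT8, FLOAT4, FLOAT8, TEXT }

    // Hypothetical column descriptor: a name plus a type.
    static final class Column {
        final String name;
        final TajoType type;

        Column(String name, TajoType type) {
            this.name = name;
            this.type = type;
        }
    }

    // Assumed mapping from a Tajo type to a Parquet primitive type name.
    static String toParquetType(TajoType type) {
        switch (type) {
            case BOOLEAN: return "boolean";
            case INT4:    return "int32";
            case INT8:    return "int64";
            case FLOAT4:  return "float";
            case FLOAT8:  return "double";
            case TEXT:    return "binary";
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    // Renders the columns as a Parquet message type, one optional field per column.
    static String toParquetSchema(String messageName, List<Column> columns) {
        StringBuilder sb = new StringBuilder("message ").append(messageName).append(" {\n");
        for (Column c : columns) {
            sb.append("  optional ").append(toParquetType(c.type)).append(' ').append(c.name);
            if (c.type == TajoType.TEXT) {
                sb.append(" (UTF8)"); // annotate text columns as UTF-8 strings
            }
            sb.append(";\n");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        List<Column> columns = new ArrayList<>();
        columns.add(new Column("id", TajoType.INT4));
        columns.add(new Column("name", TajoType.TEXT));
        System.out.println(toParquetSchema("table_schema", columns));
    }
}
```

Every field is marked optional here on the assumption that any Tajo column may hold NULL. The reverse direction (Parquet schema to Tajo schema) would invert the same table.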