[PARQUET-131] Vectorized Reader In Parquet - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: parquet-mr
Labels: None

Description

Vectorized query execution could bring a significant performance improvement to SQL engines such as Hive, Drill, and Presto. Instead of processing one row at a time, it processes a batch of rows at a time. Within a batch, each column is represented as a vector of a primitive data type. SQL engines can then apply predicates efficiently on these vectors, rather than pushing each row through all the operators before the next row can be processed.
Parquet is an efficient columnar data representation, so it would be useful for it to support vectorized APIs. All SQL engines could then read vectors directly from Parquet files and run vectorized execution over the Parquet file format.

Detailed proposal:
https://gist.github.com/zhenxiao/2728ce4fe0a7be2d3b30
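
As a rough illustration of the batch-at-a-time model described above, here is a minimal Java sketch. The names `LongColumnVector`, `RowBatch`, `selectGreaterThan`, and `sumSelected` are hypothetical and are not taken from the proposal or from parquet-mr; the sketch only assumes that a batch holds one primitive array per column and that a predicate produces a list of selected row indices.

```java
/** Hypothetical column vector: one primitive array per column (not the parquet-mr API). */
final class LongColumnVector {
    final long[] values;

    LongColumnVector(long[] values) {
        this.values = values;
    }
}

/** Hypothetical row batch: a row count plus one column vector per column. */
final class RowBatch {
    final LongColumnVector[] columns;
    final int size;

    RowBatch(int size, LongColumnVector... columns) {
        for (LongColumnVector c : columns) {
            if (c.values.length < size) {
                throw new IllegalArgumentException("column shorter than batch size");
            }
        }
        this.columns = columns;
        this.size = size;
    }
}

final class VectorizedFilterSketch {
    /**
     * Applies "value > threshold" to one column of the whole batch in a tight loop.
     * Writes the indices of matching rows into `selected` and returns how many matched.
     */
    static int selectGreaterThan(RowBatch batch, int columnIndex, long threshold, int[] selected) {
        long[] v = batch.columns[columnIndex].values;
        int n = 0;
        for (int i = 0; i < batch.size; i++) {
            if (v[i] > threshold) {
                selected[n++] = i;
            }
        }
        return n;
    }

    /** Sums one column over the selected rows only. */
    static long sumSelected(RowBatch batch, int columnIndex, int[] selected, int count) {
        long[] v = batch.columns[columnIndex].values;
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += v[selected[i]];
        }
        return sum;
    }

    public static void main(String[] args) {
        RowBatch batch = new RowBatch(4,
                new LongColumnVector(new long[] {5, 12, 7, 30}),  // column 0: filter key
                new LongColumnVector(new long[] {1, 2, 3, 4}));   // column 1: payload
        int[] selected = new int[batch.size];
        int count = selectGreaterThan(batch, 0, 6, selected);
        System.out.println(count + " rows selected, sum = " + sumSelected(batch, 1, selected, count));
    }
}
```

Each operator here loops over a primitive array for the whole batch before the next operator runs, which is the contrast with a row-at-a-time pipeline that the description draws. A real reader would also have to handle nulls, other primitive types, and encodings, none of which this sketch attempts.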

Attachments

ParquetInPresto.pdf	02/Dec/14 08:17	79 kB	Zhenxiao Luo
Parquet-Vectorized-APIs.pdf	01/Dec/14 11:42	591 kB	Dong Chen

Issue Links

is related to

HIVE-8128 Improve Parquet Vectorization	Patch Available

Sub-Tasks

1.	[Vectorized Reader] Support Complex Types (Map, Array, Struct) in Parquet Vectorized Reader	In Progress	Nezih Yigitbasi
2.	[Vectorized Reader] ColumnVector length should be in terms of rows, not DataPages	In Progress	Nezih Yigitbasi
3.	[Vectorized Reader] Make sure all encodings work in Parquet Vectorized Reader	In Progress	Nezih Yigitbasi
4.	[Vectorized Reader] Lazy Load in Vectorized Reader	Open	Unassigned
5.	[Vectorized Reader] Lazy Decoding in Vectorized Reader	Open	Unassigned
6.	[Vectorized Reader] Add Testcases/Benchmarks for ParquetVectorizedReader	In Progress	Nezih Yigitbasi
7.	[Vectorized Reader] Add attributes in ColumnVector and RowBatch	Open	Nezih Yigitbasi

People

Assignee: Zhenxiao Luo

Reporter: Zhenxiao Luo

Votes: 1

Watchers: 36

Dates

Created: 11/Nov/14 00:25

Updated: 23/Jun/24 03:27