[SPARK-18540] Wholestage code-gen for ORC Hive tables - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: 2.3.0
Component/s: None
Labels: None

Description

Spark already optimizes reading of Hive Parquet tables with whole-stage code generation.

A similar approach could be used for scanning Hive ORC tables, which currently go through the standard Hive table scan.

ORC is sometimes preferred over Parquet in the Hive ecosystem because of its better support and more suitable characteristics in certain scenarios.
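One way to see the difference described above is to look at the physical plan: Spark prefixes operators that run inside whole-stage code generation with `*` (or `*(n)` with a stage id in later versions), while a plain Hive table scan appears without the marker. The following is a minimal sketch, not part of Spark, that reads plan text and reports which operators carry the marker. The helper name `codegen_report` and the sample plan are assumptions made for illustration; the sample is hand-written in the shape of `explain()` output and does not come from a real run.

```python
import re

# One plan line: optional tree-drawing prefix, optional codegen marker
# ("*" or "*(<stage id>)"), then the operator name.
_PLAN_LINE = re.compile(r"^[\s:+\-|]*(\*(?:\(\d+\))?)?\s*([A-Za-z]\w*)")


def codegen_report(plan_text):
    """Return (operator, in_codegen) pairs, one per operator line of the plan.

    Hypothetical helper for illustration; it only inspects text.
    """
    report = []
    for line in plan_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("=="):
            continue  # skip blank lines and the "== Physical Plan ==" header
        match = _PLAN_LINE.match(line)
        if match:
            report.append((match.group(2), match.group(1) is not None))
    return report


# Hand-written sample (an assumption, not captured from a real query).
SAMPLE_PLAN = """== Physical Plan ==
*Project [id#0]
+- *Filter (id#0 > 10)
   +- HiveTableScan [id#0]
"""

if __name__ == "__main__":
    for operator, in_codegen in codegen_report(SAMPLE_PLAN):
        print(operator, "codegen" if in_codegen else "no codegen")
```

In practice the plan text would be captured from what `explain()` prints for a query over the table in question. A scan operator reported without the marker is the case this issue asks to improve.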

Issue Links

blocks: SPARK-20901 Feature parity for ORC with Parquet (Open)
duplicates: SPARK-16060 Vectorized ORC reader (Resolved)

People

Assignee: Unassigned
Reporter: Damian Momot
Votes: 0
Watchers: 3

Dates

Created: 22/Nov/16 12:58
Updated: 13/Jan/18 00:00
Resolved: 13/Jan/18 00:00