PIG-2397: Running TPC-H Benchmark on Pig

    Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Tags:
      Pig TPC-H benchmark

      Description

      For a class project we developed a whole set of Pig scripts for TPC-H. Our goals are:

      1) identifying the bottlenecks in Pig's performance, especially in its relational operators;

      2) studying how to write efficient scripts by making full use of Pig Latin's features;

      3) comparing with Hive's TPC-H results to verify both 1) and 2).

      We will update the JIRA with our scripts, results and analysis soon.

      1. pig_tpch.ppt
        341 kB
        Jie Li
      2. TPC-H_on_Pig.tgz
        10 kB
        Jie Li


          Activity

          alex gemini added a comment -

          600 map tasks: 2334 sec and 200 map tasks: 2157 sec. I guess the join will not benefit from more maps, since each map needs to read the small-table intermediate result again and deserialize it into Java in-memory objects. Can you share the hdfs_read_bytes metrics?

          rim moussar added a comment -

          Hi,
          I've sent you the report by email, and I've uploaded Pig scripts using the big TPC-H file on my homepage (https://sites.google.com/site/rimmoussa/research).

          cheers,
          Dr. Rim Moussa

          Jie Li added a comment -

          Thanks very much for letting us know. Sorry that we didn't find them earlier; otherwise we wouldn't have needed to write the scripts ourselves and could have spent more time analysing the results.

          Is your report available?

          rim moussar added a comment -

          Hi,
          please note that the TPC-H queries were translated into Pig Latin and have been available for download from my homepage https://sites.google.com/site/rimmoussa/research since July 2011.

          Also, to avoid the cost of join operations, I've combined all the .tbl files into one file.

          cheers,
          Dr. Rim Moussa

          Thejas M Nair added a comment -

          > When the split sizes are comparable for TPC-H Q1, Hive's tasks finish in about 60 seconds on average, while Pig takes about 84 seconds. I believe this is due to the fact that Hive triggers in-mem aggregation and output based on memory utilization; we have a hardcoded MAX_SIZE_CURVAL_CACHE = 1024. In this particular case, that means Hive's tasks output 4 records (a single aggregation), while we output 28 (9 aggregations). If we make MAX_SIZE_CURVAL_CACHE configurable, or based on memory, we can probably improve performance for small records.

          MAX_SIZE_CURVAL_CACHE limits the number of values held in memory for a particular group-key. Once that limit is hit or a new group-key is seen, it aggregates the values for that key and stores the result back in the hash-map. That does not trigger dumping to disk. Are you saying that you got 28 output records from a single map, though there were only 4 unique group-keys? I expect only 4 output records from a single map, because the hashmap with 4 entries should easily fit in memory. If that is the case, I need to check why that might be happening.

          Dmitriy V. Ryaboy added a comment -

          Interesting; tasks take much longer to set up on my cluster (10s vs the ~5s your numbers indicate), probably due to overall pressure on the JT. Thanks for running the numbers for Hive.

          I'll profile a few map tasks on Q1 to see where it's spending time.

          Jie Li added a comment -

          Here is a comparison of Hive's Q1 over 100GB:

          600 map tasks: 2334 sec
          200 map tasks: 2157 sec

          Jie Li added a comment -

          > The variability in my numbers is pretty much completely due to delays in task scheduling (busy cluster).

          I see. We used a dedicated cluster (though it was on Amazon EC2).

          > I find it hard to imagine that changing the split size by 8x didn't affect Hive performance

          For Q1 over 100GB of data, the lineitem table consists of 600 HDFS blocks (our default block size is 128MB), so 600 map tasks need about 40 waves in our cluster (16 map slots). If each task takes 2 seconds to set up, the total task setup time is around 80 seconds, which can be ignored compared to Hive's 2300 seconds.

          Dmitriy V. Ryaboy added a comment -

          The variability in my numbers is pretty much completely due to delays in task scheduling (busy cluster). The amount of time the tasks themselves took stayed quite consistent. That's part of my point about the fallacy of just measuring elapsed wall-clock time as a single metric. You have to account for degree of parallelism and separate out framework overhead from time spent on work.

          I find it hard to imagine that changing the split size by 8x didn't affect Hive performance on an under-provisioned cluster (meaning, in this case, a cluster that can't run all the tasks in parallel simultaneously). Perhaps Hive is doing some task aggregation for you under the covers. What was the number of map tasks it spun up? What was the number of tasks Pig spun up?

          Jie Li added a comment -

          Thanks Dmitriy for your analysis.

          > Hive ranged between 160 and 240 seconds, while Pig ranged between 290 and 350 (ish) on several runs of Q1.

          It seems your results were not stable. Most of our benchmark results have a variance within 5%.

          > 1) The hive TPC-H scripts set mapred.min.split.size=536870912 while Pig ones do not.

          Yeah, we should have mentioned this. We actually removed this parameter from Hive's query file, and we found it didn't make much difference to Hive. I agree that in this case it might be better to have fewer map tasks, as we don't need to overlap them with the reduce tasks. For our project we didn't optimize the configuration.

          > 2) We generate a sampling job for an ORDER-BY even when the parallelism of that operator is set to 1

          Yeah, it would be nice to decide dynamically whether the sampling job is needed.

          Dmitriy V. Ryaboy added a comment -

          I was curious about the massive difference between what Jie was seeing for Hive and Pig on Q1, and did a little digging of my own.
          I couldn't get the same difference in performance out of the box at all on my cluster – Hive ranged between 160 and 240 seconds, while Pig ranged between 290 and 350 (ish) on several runs of Q1.

          Digging in a little further, I think there are 3 things worth noting:
          1) The hive TPC-H scripts set mapred.min.split.size=536870912 while Pig ones do not. This means Pig will pick up whatever the cluster defaults are, and the difference in # of mappers will be greatly exaggerated when running on small clusters incapable of running hundreds of tasks in parallel (task set-up costs will keep accumulating). I recommend this parameter be set to be the same as the one in Hive TPC-H in PIG-2397, for consistency.

          2) We generate a sampling job for an ORDER-BY even when the parallelism of that operator is set to 1 (so sampling and custom partitioning is useless). That's just free performance gains, and comes up in many real-life cases, not just benchmarks. We should fix this and get 30 seconds per job back.

          3) When the split sizes are comparable for TPC-H Q1, Hive's tasks finish in about 60 seconds on average, while Pig takes about 84 seconds. I believe this is due to the fact that Hive triggers in-mem aggregation and output based on memory utilization; we have a hardcoded MAX_SIZE_CURVAL_CACHE = 1024. In this particular case, that means Hive's tasks output 4 records (a single aggregation), while we output 28 (9 aggregations). If we make MAX_SIZE_CURVAL_CACHE configurable, or based on memory, we can probably improve performance for small records.

          D
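
          For reference, the split-size setting from point 1 can be applied directly in a Pig script. This is a minimal sketch; the property name and value are taken from the Hive TPC-H scripts mentioned above:

              -- match the Hive TPC-H minimum split size (536870912 bytes = 512 MB)
              -- so both systems spin up a comparable number of map tasks
              set mapred.min.split.size 536870912;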

          Jie Li added a comment -

          Dmitriy, thanks for your comments. As I mentioned in PIG-2228 (https://issues.apache.org/jira/browse/PIG-2228), Q1's group-by has only four distinct groups, so the order-by was just sorting four groups of numbers. We can assume the group-by took most of the time.

          Dmitriy V. Ryaboy added a comment -

          Jie, something just occurred to me about Q1 – are you sure Hive is doing the right thing here?
          If it's using more than 1 reducer for the ORDER operation, it's not:

          Syntax of Order By
          The ORDER BY syntax in Hive QL is similar to the syntax of ORDER BY in SQL language.
          
          colOrder: ( ASC | DESC )
          orderBy: ORDER BY colName colOrder? (',' colName colOrder?)*
          query: SELECT expression (',' expression)* FROM src orderBy
          There are some limitations in the "order by" clause. In the strict mode 
          (i.e., hive.mapred.mode=strict), the order by clause has to be followed 
          by a "limit" clause. The limit clause is not necessary if you set 
          hive.mapred.mode to nonstrict. The reason is that in order to impose 
          total order of all results, there has to be one reducer to sort the final 
          output. If the number of rows in the output is too large, the single reducer 
          could take a very long time to finish.
          
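          For comparison, a hypothetical Pig Latin sketch of the corresponding single-reducer sort (results and revenue are placeholder names, not from the benchmark scripts):

              -- Pig's total sort; even with PARALLEL 1 a sampling job is
              -- currently generated first, as noted earlier in this thread
              sorted = ORDER results BY revenue DESC PARALLEL 1;
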
          Jie Li added a comment -

          I did a quick test of the 0.10 branch for the hash-based agg, using the two simplest queries, Q1 and Q6. I'll update the results in PIG-2228.

          Thejas M Nair added a comment -

          Created PIG-2423 to update the Pig documentation on using co-group instead of join.

          Thejas M Nair added a comment -

          Jie,
          Thanks for doing this benchmark and providing the analysis of the factors that affect the overall performance. This is very useful.

          One of the reasons for creating a new language for Pig was to enable users to express a more optimal query plan in the query itself. The language lets you express the optimizations mentioned in 2, 3, 4 and 5 directly. This is a very useful feature of Pig, because even if the optimizer is very good, there will be cases where it does the wrong thing. Also, it will be some time before a good cost-based optimizer is available for Pig.

          As you mention, Pig currently only has a rule-based optimizer. The optimizations 2 - 5 that you mention should improve performance in almost all cases, so it makes sense to implement them as rules in Pig.

          Regarding 5, the work on lazy de-serialization done in PIG-2359 is going to be useful.

          Regarding the optimization of a join followed by a group-by: even if the join and group keys are different, hash-based aggregation can be used to reduce the size of the output written to HDFS by the join's MR job, by doing partial aggregation in the reduce.
          For the case where the join and group have the same keys, having the Pig optimizer rewrite the query into a co-group operation might be the easiest thing to do.

          I don't think optimization tips 2 and 3 are in the Pig documentation; it makes sense to document them. I will open another jira to address that.

          I am really looking forward to seeing the results with the pig 0.10 branch (http://svn.apache.org/repos/asf/pig/branches/branch-0.10) (with hash-based agg enabled).
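
          The co-group rewrite described above can be sketched in Pig Latin. This is an illustration only; the relation and field names (orders, customer, o_custkey, c_custkey, o_totalprice) are hypothetical TPC-H-style placeholders, not taken from the benchmark scripts:

              -- a join followed by a group-by on the same key costs two MR jobs:
              J = JOIN orders BY o_custkey, customer BY c_custkey;
              G = GROUP J BY o_custkey;

              -- a COGROUP on that key does the grouping in a single job;
              -- aggregation is then done over the grouped bags:
              CG = COGROUP orders BY o_custkey, customer BY c_custkey;
              R = FOREACH CG GENERATE group, COUNT(orders), SUM(orders.o_totalprice);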

          Jie Li added a comment -

          As we summarized in the slides, there are many factors affecting the overall performance:

          1. Reorder JOINs properly
          Not sure if Pig is moving toward a cost-based optimizer that figures out a better join order/type.

          2&3. Multi-query optimization, e.g. using COGROUP for JOIN + GROUP, and using FLATTEN for self-join + GROUP
          First, can such rules be implemented as query-rewriting rules? They seem to be always beneficial.
          Second, even without query rewriting, it's still possible to take advantage of a common key between the join and the group. Hive is working on this now (https://issues.apache.org/jira/browse/HIVE-1772).

          4. Project before (CO)GROUP (https://issues.apache.org/jira/browse/PIG-1324)
          This issue affects the performance of GROUP/COGROUP. Currently users have to explicitly drop redundant columns before a (CO)GROUP, which is inconvenient and hard to do completely.

          5. Remove redundant types in LOAD (https://issues.apache.org/jira/browse/PIG-410)
          Again, inconvenient for users; this should be done by Pig itself. For columns with types, could Pig delay the conversion?

          6. Hash-based aggregation
          This feature should significantly improve group-by. Not sure whether it can be further used for multi-query optimization; for example, for X (whatever job) + group-by, we could use a hash-based combiner at the end of X, as Hive already does.

          Except for the first factor (join reordering), which Hive cannot do either, all the other factors can contribute to Hive's outperformance. Glad to see that Pig has already implemented hash-based aggregation. We're excited to repeat the benchmark with this new feature.
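
          Point 4 can be illustrated with a short Pig Latin sketch. The relation and column names are placeholders in the style of TPC-H's lineitem table, not from the actual scripts:

              -- explicitly project only the needed columns before GROUP, since
              -- Pig does not yet prune unused columns automatically (PIG-1324):
              lineitem_small = FOREACH lineitem GENERATE l_returnflag, l_linestatus, l_quantity;
              grouped = GROUP lineitem_small BY (l_returnflag, l_linestatus);
              result = FOREACH grouped GENERATE group, SUM(lineitem_small.l_quantity);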

          Thejas M Nair added a comment -

          Most of the queries where Hive outperformed Pig are SQL group-by queries. With the map-side hash-based aggregation added in PIG-2228, group-by queries see improvements of up to 50%. Can you try these queries with the pig 0.10 branch and the pig.exec.mapPartAgg property set to true?
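
          The property can be set at the top of a Pig script; a minimal sketch:

              -- enable map-side partial (hash-based) aggregation, per PIG-2228
              set pig.exec.mapPartAgg true;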

          Jie Li added a comment -

          We have updated the slides and the scripts we used. Any comments are appreciated. We will post our report later.


            People

            • Assignee: Unassigned
            • Reporter: Jie Li
            • Votes: 0
            • Watchers: 8
