[PIG-4135] Fetch optimization should be disabled if plan contains no limit - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.14.0
Component/s: None
Labels: None

Description

After fetch optimization was deployed in production, a couple of users ran into the following situation. Their input data was fairly large, but filtering it with a regular expression left only a small result, so they did not add a limit to the query.

The problem is that even though the output is small, the input still has to be processed in the cluster, not in the client. However, fetch optimization blindly fetches the entire input into the client, because the plan is a map-only job that ends with a dump.
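The rule proposed in the title can be sketched as a small decision function. This is a simplified model for illustration only, not Pig's actual code and not the contents of the attached patch: the function name `is_fetchable`, the `fetch_enabled` flag, and the operator names in the list are all hypothetical.

```python
# Simplified model of the rule in this issue's title; NOT Pig's actual code.
# The operator names and is_fetchable() are hypothetical, for illustration.

def is_fetchable(plan_ops, fetch_enabled=True):
    """Return True if a map-only plan may run in the client (fetch mode).

    plan_ops: operator names of the plan, in execution order.
    """
    if not fetch_enabled:
        return False
    # Fetch only applies to plans that finish with a dump.
    if not plan_ops or plan_ops[-1] != "DUMP":
        return False
    # Proposed rule: without a LIMIT nothing bounds how much input the
    # client has to read, so the plan should run in the cluster instead.
    return "LIMIT" in plan_ops


# The reported case: large input, selective regex filter, no limit.
print(is_fetchable(["LOAD", "FILTER", "DUMP"]))           # False
print(is_fetchable(["LOAD", "FILTER", "LIMIT", "DUMP"]))  # True
```

In this model the filter-then-dump query from the description is no longer fetched, while the same query with an explicit limit still is.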

Attachments

PIG-4135-1.patch (3 kB), uploaded by Cheolsoo Park, 20/Aug/14 23:33

Issue Links

is related to: PIG-3642 Direct HDFS access for small jobs (fetch) (Closed)

People

Assignee: Cheolsoo Park

Reporter: Cheolsoo Park

Votes: 0

Watchers: 3

Dates

Created: 20/Aug/14 23:27

Updated: 21/Nov/14 05:59

Resolved: 21/Aug/14 21:24