[PIG-627] PERFORMANCE: multi-query optimization - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.2.0
Fix Version/s: 0.3.0
Component/s: None
Labels: None

Description

Currently, if your Pig script contains multiple stores and some shared computation, Pig will execute several independent queries. For instance:

A = load 'data' as (a, b, c);
B = filter A by a > 5;
store B into 'output1';
C = group B by b;
store C into 'output2';

This script results in a map-only job that generates output1, followed by a map-reduce job that generates output2. As a result, the data is read, parsed, and filtered twice, which is unnecessary and costly.
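The cost described above can be illustrated with a small Python model. This is only a sketch, not Pig's implementation: the sample rows, the `CountingSource` class, and the `run_independent` / `run_shared` functions are hypothetical names introduced here to count how often each record is read and filtered under the two strategies.

```python
from collections import defaultdict

# Made-up rows standing in for 'data'; each row is (a, b, c).
ROWS = [(3, "x", 1), (7, "x", 2), (9, "y", 3), (6, "y", 4), (1, "z", 5)]


class CountingSource:
    """Hypothetical input wrapper that counts reads and filter evaluations."""

    def __init__(self, rows):
        self.rows = rows
        self.records_read = 0
        self.filter_evals = 0

    def scan(self):
        # Models: A = load 'data' as (a, b, c);
        for row in self.rows:
            self.records_read += 1
            yield row

    def keep(self, row):
        # Models: B = filter A by a > 5;
        self.filter_evals += 1
        return row[0] > 5


def group_by_b(rows):
    # Models: C = group B by b;
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row)
    return dict(groups)


def run_independent(src):
    """One query per store: load and filter are repeated for each output."""
    output1 = [r for r in src.scan() if src.keep(r)]
    output2 = group_by_b(r for r in src.scan() if src.keep(r))
    return output1, output2


def run_shared(src):
    """Shared computation: load and filter once, feed both stores."""
    output1 = [r for r in src.scan() if src.keep(r)]
    output2 = group_by_b(output1)
    return output1, output2
```

In this model both strategies produce the same two outputs, but the independent version scans and filters every input record twice, while the shared version touches each record once.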

Attachments

File | Date | Size | Author
doc-fix.patch | 15/Apr/09 08:33 | 5 kB | Gunther Hagleitner
error_handling_0415.patch | 16/Apr/09 04:44 | 27 kB | Gunther Hagleitner
error_handling_0416.patch | 17/Apr/09 02:54 | 27 kB | Gunther Hagleitner
file_cmds-0305.patch | 06/Mar/09 05:22 | 33 kB | Gunther Hagleitner
fix_store_prob.patch | 26/Mar/09 04:51 | 26 kB | Gunther Hagleitner
merge_741727_HEAD__0324_2.patch | 24/Mar/09 22:15 | 595 kB | Gunther Hagleitner
merge_741727_HEAD__0324.patch | 24/Mar/09 21:16 | 591 kB | Gunther Hagleitner
merge_trunk_to_branch.patch | 08/Apr/09 01:12 | 13 kB | Gunther Hagleitner
merge-041409.patch | 14/Apr/09 19:00 | 21 kB | Gunther Hagleitner
multiquery_0223.patch | 23/Feb/09 19:08 | 110 kB | Gunther Hagleitner
multiquery_0224.patch | 25/Feb/09 01:10 | 146 kB | Gunther Hagleitner
multiquery_0306.patch | 07/Mar/09 01:52 | 32 kB | Richard Ding
multiquery_explain_fix.patch | 19/Mar/09 22:20 | 3 kB | Gunther Hagleitner
multiquery-phase2_0313.patch | 13/Mar/09 21:08 | 86 kB | Richard Ding
multiquery-phase2_0323.patch | 23/Mar/09 20:12 | 88 kB | Richard Ding
multiquery-phase3_0423.patch | 24/Apr/09 00:42 | 77 kB | Richard Ding
multi-store-0303.patch | 03/Mar/09 21:18 | 77 kB | Gunther Hagleitner
multi-store-0304.patch | 05/Mar/09 00:11 | 78 kB | Gunther Hagleitner
non_reversible_store_load_dependencies_2.patch | 05/Apr/09 03:16 | 90 kB | Gunther Hagleitner
non_reversible_store_load_dependencies.patch | 02/Apr/09 22:46 | 76 kB | Gunther Hagleitner
noop_filter_absolute_path_flag_0401.patch | 01/Apr/09 08:29 | 125 kB | Gunther Hagleitner
noop_filter_absolute_path_flag.patch | 31/Mar/09 00:11 | 88 kB | Gunther Hagleitner
streaming-fix.patch | 14/Apr/09 06:33 | 10 kB | Gunther Hagleitner

People

Assignee: Gunther Hagleitner

Reporter: Olga Natkovich

Votes: 0

Watchers: 7

Dates

Created: 21/Jan/09 00:39

Updated: 24/Mar/10 22:10

Resolved: 04/May/09 20:51