Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Cannot Reproduce
-
0.5.0
-
None
-
None
Description
It seems there is a bug in PIG when ORDER BY is used twice on the same relation using ASC and DESC
I have the following script:
imei_start = FOREACH sessions GENERATE imei, start;
imei_starts = GROUP imei_start BY imei;
imei_retained_period = FOREACH imei_starts {
ordered_imei_start = ORDER imei_start BY start DESC;
first_start = LIMIT ordered_imei_start 1;
rev_ordered_imei_start = ORDER imei_start BY start ASC;
last_start = LIMIT rev_ordered_imei_start 1;
GENERATE group, ordered_imei_start, rev_ordered_imei_start;
};
ordered_imei_start and rev_ordered_imei_start are actually the same (they are both sorted in the ASC way) and so last_start is always equal to first_start.
If only one of the 2 ORDER BY is performed, there is no issue.