The following script loses the column name i after the foreach statements are merged:
a = load 'num.txt' as (i);
b = foreach a generate (int)i;
c = foreach b generate i + 60 as i;
store c into 'sectest';
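For reference, merging b and c by hand would presumably give a single statement like the one below. This is only a sketch of the equivalent script, not the optimizer's actual output; the point is that the explicit alias i should survive the merge.

```pig
a = load 'num.txt' as (i);
-- b and c collapsed into one foreach (hand-written equivalent, an assumption);
-- the cast and the addition are combined, and the output keeps the name i
c = foreach a generate (int)i + 60 as i;
store c into 'sectest';
```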
[exec] +1 overall.
[exec] +1 @author. The patch does not contain any @author tags.
[exec] +1 tests included. The patch appears to include 3 new or modified tests.
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
All tests pass.
With this fix, Pig does not merge the foreach statements if the first foreach has a user-defined schema. We should consider supporting that case as well (in a separate JIRA), because users sometimes add a foreach statement just to give convenient names to columns, for example:
J = join A by col1, B by col1;
F1 = foreach J generate A::col1 as Acol1, A::col2 as col2, B::col1 as Bcol1; -- foreach added just to give convenient names to expressions
F2 = foreach F1 generate Acol1 + col2, Bcol1 + col2;
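If merging were supported in this case, F1 and F2 could in principle be collapsed by substituting the renamed columns back into the second foreach. The following is a hand-written sketch of that equivalent form, assuming plain substitution; it is not what the optimizer currently produces.

```pig
J = join A by col1, B by col1;
-- Acol1 -> A::col1, col2 -> A::col2, Bcol1 -> B::col1 substituted by hand
F2 = foreach J generate A::col1 + A::col2, B::col1 + A::col2;
```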
Patch committed to both trunk and the 0.8 branch. Further improvement is possible, as per Thejas's comment.