Pig / PIG-1695

MergeForEach does not carry user defined schema if any one of the merged ForEach has user defined schema

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 0.8.0
    • Component/s: impl
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In the following script, the column name i is lost after the two ForEach statements are merged.

      a = load 'num.txt' as (i);
      b = foreach a generate (int)i;
      c = foreach b generate i + 60 as i;
      store c into 'sectest';
      

        Activity

        Daniel Dai added a comment -

        Patch committed to both trunk and 0.8 branch. Future improvement is possible as per Thejas's comment.

        Thejas M Nair added a comment -

        +1

        With this fix, Pig does not merge the foreach statements if the first foreach has a user-defined schema. We should consider supporting that case as well (in a separate JIRA), because users sometimes add a foreach statement just to give convenient names to columns.
        For example:

        J = join A by col1, B by col1;
        F1 = foreach J generate A::col1 as Acol1, A::col2 as col2, B::col1 as Bcol1; -- foreach added just to give convenient names to expressions
        F2 = foreach F1 generate Acol1 + col2, Bcol1 + col2;
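
        As an illustration only (not taken from the patch or from Pig's optimizer output), merging F1 into F2 would presumably amount to substituting each renamed column with the expression it stands for. A hand-written sketch of that merged form, assuming the same relations A and B as above:

        ```
        -- Hypothetical hand-merged equivalent of F1 + F2:
        --   Acol1 -> A::col1, col2 -> A::col2, Bcol1 -> B::col1
        J = join A by col1, B by col1;
        F2 = foreach J generate A::col1 + A::col2, B::col1 + A::col2;
        ```

        The names introduced in F1 are only used inside F2's expressions, so dropping them here does not change what F2 computes.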
        
        Daniel Dai added a comment -

        test-patch result:

        [exec]
        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 3 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

        All tests pass.


          People

          • Assignee:
            Daniel Dai
            Reporter:
            Daniel Dai
          • Votes:
            0
            Watchers:
            0