Pig
  1. Pig
  2. PIG-3239

Unable to return multiple values from a macro using SPLIT

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.12.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      Apache Pig version 0.10.0-cdh4.2.0 (rexported)
      compiled Feb 15 2013, 12:19:17

      Linux 3.2.0-38-generic #61-Ubuntu SMP Tue Feb 19 12:18:21 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

      Description

      Hi, I'm unable to return multiple values from a macro when values come from a SPLIT. Here is an small example:

      DEFINE my_macro(seq) RETURNS valid, invalid {
          added = FOREACH $seq GENERATE $0 * 2, $1;
          SPLIT added INTO $valid IF $1 == true, $invalid OTHERWISE;
      }
      
      data = LOAD 'case.csv' USING PigStorage(',') AS (value: int, valid: boolean);
      P, Q = my_macro(data);
      DUMP P;
      DUMP Q;
      

      Pig is unable to recognize the OTHERWISE side. Error is: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <at case.pig, line 3> Invalid macro definition: . Reason: Macro 'my_macro' missing return alias: invalid

      Simple workaround is to force $invalid to be returned as FOREACH result:

      SPLIT added INTO $valid IF $1 == true, tmp_invalid OTHERWISE;
      $invalid = FOREACH tmp_invalid GENERATE *;
      

      Samples and logs attached to the issue.

      1. PIG-3239.patch.txt
        0.7 kB
        Johnny Zhang
      2. PIG-3239.patch.txt
        2 kB
        Johnny Zhang

        Activity

        Show
        Luis Belloch added a comment - Files: https://gist.github.com/luisbelloch/5099190
        Hide
        Johnny Zhang added a comment -

        Luis Belloch, please correct me if I am wrong, I think 'SPLIT' doesn't work with 'OTHERWISE'. 'SPLIT' only work with 'IF' http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#SPLIT

        actually below script works with me very well on your example

        {noforamt}

        DEFINE my_macro(seq) RETURNS valid, invalid

        { added = FOREACH $seq GENERATE $0 * 2, $1; SPLIT added INTO $valid IF $1 == true, $invalid IF $1 ==false; }

        data = LOAD 'case.csv' USING PigStorage(',') AS (value: int, valid: boolean);
        P, Q = my_macro(data);
        DUMP P;
        DUMP Q;

        
        
        Show
        Johnny Zhang added a comment - Luis Belloch , please correct me if I am wrong, I think 'SPLIT' doesn't work with 'OTHERWISE'. 'SPLIT' only work with 'IF' http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#SPLIT actually below script works with me very well on your example {noforamt} DEFINE my_macro(seq) RETURNS valid, invalid { added = FOREACH $seq GENERATE $0 * 2, $1; SPLIT added INTO $valid IF $1 == true, $invalid IF $1 ==false; } data = LOAD 'case.csv' USING PigStorage(',') AS (value: int, valid: boolean); P, Q = my_macro(data); DUMP P; DUMP Q;
        Hide
        Johnny Zhang added a comment -

        sorry for the typo, I mean

        DEFINE my_macro(seq) RETURNS valid, invalid
        { added = FOREACH $seq GENERATE $0 * 2, $1; SPLIT added INTO $valid IF $1 == true, $invalid IF $1 ==false; } data = LOAD 'case.csv' USING PigStorage(',') AS (value: int, valid: boolean); P, Q = my_macro(data); DUMP P; DUMP Q; 
        
        Show
        Johnny Zhang added a comment - sorry for the typo, I mean DEFINE my_macro(seq) RETURNS valid, invalid { added = FOREACH $seq GENERATE $0 * 2, $1; SPLIT added INTO $valid IF $1 == true, $invalid IF $1 ==false; } data = LOAD 'case.csv' USING PigStorage(',') AS (value: int, valid: boolean); P, Q = my_macro(data); DUMP P; DUMP Q;
        Hide
        Luis Belloch added a comment -

        Hi Johnny Zhang, if you have a look at version 0.10, OTHERWISE keyword exists.
        http://pig.apache.org/docs/r0.10.0/basic.html#SPLIT

        Show
        Luis Belloch added a comment - Hi Johnny Zhang, if you have a look at version 0.10, OTHERWISE keyword exists. http://pig.apache.org/docs/r0.10.0/basic.html#SPLIT
        Hide
        Johnny Zhang added a comment -

        Luis Belloch, you are right, sorry I was looking at the old release doc.

        Show
        Johnny Zhang added a comment - Luis Belloch , you are right, sorry I was looking at the old release doc.
        Hide
        Johnny Zhang added a comment -

        this fix works for me. I will run all unit tests see if any regression it brings.

        Show
        Johnny Zhang added a comment - this fix works for me. I will run all unit tests see if any regression it brings.
        Hide
        Luis Belloch added a comment -

        Thanks! We'll test it internally.

        Show
        Luis Belloch added a comment - Thanks! We'll test it internally.
        Hide
        Cheolsoo Park added a comment -

        Johnny Zhang, thank you for the fix. I think your fix is correct. Can you please add a unit test case for this?

        TestMacroExpansion.java has splitTest, but that doesn't cover OTHERWISE. You might want to expand that test case, or add a new test case.

        Thanks!

        Show
        Cheolsoo Park added a comment - Johnny Zhang , thank you for the fix. I think your fix is correct. Can you please add a unit test case for this? TestMacroExpansion.java has splitTest, but that doesn't cover OTHERWISE. You might want to expand that test case, or add a new test case. Thanks!
        Hide
        Cheolsoo Park added a comment -

        Chatted with Johnny, and it looks like TestMacroExpansion isn't a good place to put a new test case for this. It might be better to add one to TestSplit.

        Show
        Cheolsoo Park added a comment - Chatted with Johnny, and it looks like TestMacroExpansion isn't a good place to put a new test case for this. It might be better to add one to TestSplit.
        Hide
        Johnny Zhang added a comment -

        Cheolsoo Park, thanks a lot for review my patch. I have add one test case into TestSplit.

        Show
        Johnny Zhang added a comment - Cheolsoo Park , thanks a lot for review my patch. I have add one test case into TestSplit.
        Hide
        Cheolsoo Park added a comment -

        +1. Let me run unit tests.

        Show
        Cheolsoo Park added a comment - +1. Let me run unit tests.
        Hide
        Cheolsoo Park added a comment -

        Committed to trunk. Thanks Johnny!

        Show
        Cheolsoo Park added a comment - Committed to trunk. Thanks Johnny!

          People

          • Assignee:
            Johnny Zhang
            Reporter:
            Luis Belloch
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development