Pig
  1. Pig
  2. PIG-598

Parameter substitution ($PARAMETER) should not be performed in comments

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.7.0
    • Component/s: impl
    • Labels:
      None

      Description

      Compiling the following code example will generate an error that $NOT_A_PARAMETER is an Undefined Parameter.

      This is problematic as sometimes you want to comment out parts of your code, including parameters so that you don't have to define them.

      This I think it would be really good if parameter substitution was not performed in comments.

      -- $NOT_A_PARAMETER
      
      -bash-3.00$ pig -exectype local -latest comment.pig
      USING: /grid/0/gs/pig/current
      java.lang.RuntimeException: Undefined parameter : NOT_A_PARAMETER
              at org.apache.pig.tools.parameters.PreprocessorContext.substitute(PreprocessorContext.java:221)
              at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.parsePigFile(ParameterSubstitutionPreprocessor.java:106)
              at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.genSubstitutedFile(ParameterSubstitutionPreprocessor.java:86)
              at org.apache.pig.Main.runParamPreprocessor(Main.java:394)
              at org.apache.pig.Main.main(Main.java:296)
      
      1. PIG-598.1.patch
        27 kB
        Thejas M Nair
      2. PIG-598.patch
        24 kB
        Thejas M Nair

        Issue Links

          Activity

          Hide
          George Mavromatis added a comment -

          This has been flagged as minor, I think that's a rather serious and elementary one. It takes out the ability to comment/uncomment parameters and creates very bizarre error messages to people who are not aware of it. Can we increase priority?

          Show
          George Mavromatis added a comment - This has been flagged as minor, I think that's a rather serious and elementary one. It takes out the ability to comment/uncomment parameters and creates very bizarre error messages to people who are not aware of it. Can we increase priority?
          Hide
          Thejas M Nair added a comment -

          With this patch

          • Parameters in comments are no longer substituted
          • Line numbers don't change after parameter substitution, as long as declare/default don't span multiple lines (no multi-line literals etc). This will help in giving accurate line numbers in error messages after parameter substitution.

          Code changes:
          Instead of the parser processing the pig script a line at a time, the whole script is processed at once.
          There are changes in test cases, because original line numbers are now preserved.

          Show
          Thejas M Nair added a comment - With this patch Parameters in comments are no longer substituted Line numbers don't change after parameter substitution, as long as declare/default don't span multiple lines (no multi-line literals etc). This will help in giving accurate line numbers in error messages after parameter substitution. Code changes: Instead of the parser processing the pig script a line at a time, the whole script is processed at once. There are changes in test cases, because original line numbers are now preserved.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12422928/PIG-598.patch
          against trunk revision 828891.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 48 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/111/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12422928/PIG-598.patch against trunk revision 828891. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 48 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/111/console This message is automatically generated.
          Hide
          Thejas M Nair added a comment -

          New patch generated after updating svn.

          Show
          Thejas M Nair added a comment - New patch generated after updating svn.
          Hide
          Thejas M Nair added a comment -

          Re-submitting patch for hudson.

          Show
          Thejas M Nair added a comment - Re-submitting patch for hudson.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12423034/PIG-598.patch
          against trunk revision 830757.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 48 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 199 javac compiler warnings (more than the trunk's current 197 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 release audit. The applied patch generated 318 release audit warnings (more than the trunk's current 313 warnings).

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12423034/PIG-598.patch against trunk revision 830757. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 48 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 199 javac compiler warnings (more than the trunk's current 197 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 318 release audit warnings (more than the trunk's current 313 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/125/console This message is automatically generated.
          Hide
          Thejas M Nair added a comment -

          -1 javac. The applied patch generated 199 javac compiler warnings (more than the trunk's current 197 warnings).

          The additional warnings are from code generated by javacc, which cannot be fixed in the .jj files.

          -1 release audit. The applied patch generated 318 release audit warnings (more than the trunk's current 313 warnings).

          The release audit warnings are from new test input and benchmark files, because they don't have the apache license header.

          Show
          Thejas M Nair added a comment - -1 javac. The applied patch generated 199 javac compiler warnings (more than the trunk's current 197 warnings). The additional warnings are from code generated by javacc, which cannot be fixed in the .jj files. -1 release audit. The applied patch generated 318 release audit warnings (more than the trunk's current 313 warnings). The release audit warnings are from new test input and benchmark files, because they don't have the apache license header.
          Hide
          Olga Natkovich added a comment -

          The code looks reasonable.

          When I tried to run tests with comments (both kind) in the middle of the line - they are not recognized

          Show
          Olga Natkovich added a comment - The code looks reasonable. When I tried to run tests with comments (both kind) in the middle of the line - they are not recognized
          Hide
          Ashutosh Chauhan added a comment -

          One issue I faced while working on PIG-928 was when trying to name variables in ruby bound to java variables. Ruby variable names (in BSF) needs to be prepended with $ as shown below.

          define myudf org.apache.pig.scriptedUDFs.GenericEvalFunc('wordtokens','return $input.split();');
          

          Now this $ appearing in the script should not be substituted. But parser tries to substitute it and fails (with error undefined parameter) since param is not specified while invoking Pig. My hack was to 'escape' this $ and then provide param-substiution while invoking the script as -p input=input

          define myudf org.apache.pig.scriptedUDFs.GenericEvalFunc('wordtokens','return $$input.split();');
          

          Note extra $. Obviously, if we are thinking of adding pig-928 and accept udfs written in ruby, this will come up as an issue. One possibility is Pig should not try to do substitution if param is not specified on cmd line. I think way it currently is parser scans the script for $ and then tries to do substiution. If substitution was not defined it error outs. I think if it is not specified on cmd line, then parser should ignore it and continue instead of failing.

          Show
          Ashutosh Chauhan added a comment - One issue I faced while working on PIG-928 was when trying to name variables in ruby bound to java variables. Ruby variable names (in BSF) needs to be prepended with $ as shown below. define myudf org.apache.pig.scriptedUDFs.GenericEvalFunc('wordtokens',' return $input.split();'); Now this $ appearing in the script should not be substituted. But parser tries to substitute it and fails (with error undefined parameter) since param is not specified while invoking Pig. My hack was to 'escape' this $ and then provide param-substiution while invoking the script as -p input=input define myudf org.apache.pig.scriptedUDFs.GenericEvalFunc('wordtokens',' return $$input.split();'); Note extra $. Obviously, if we are thinking of adding pig-928 and accept udfs written in ruby, this will come up as an issue. One possibility is Pig should not try to do substitution if param is not specified on cmd line. I think way it currently is parser scans the script for $ and then tries to do substiution. If substitution was not defined it error outs. I think if it is not specified on cmd line, then parser should ignore it and continue instead of failing.
          Hide
          Thejas M Nair added a comment -

          One issue I faced while working on PIG-928 was when trying to name variables in ruby bound to java variables.

          Ashutosh,
          You can use "\" to escape parameter substitution . Use 'return \$input.split();' instead of 'return $input.split();' . After parameter substitution, it becomes 'return $input.split();' .

          Show
          Thejas M Nair added a comment - One issue I faced while working on PIG-928 was when trying to name variables in ruby bound to java variables. Ashutosh, You can use "\" to escape parameter substitution . Use 'return \$input.split();' instead of 'return $input.split();' . After parameter substitution, it becomes 'return $input.split();' .
          Hide
          Thejas M Nair added a comment -

          Additional changes in this patch-

          • Fixed parsing in PigFileParser.jj
          • Modified test input file for testCommentWithParam() - inputComment.pig, to include comments within and at end of statements
          Show
          Thejas M Nair added a comment - Additional changes in this patch- Fixed parsing in PigFileParser.jj Modified test input file for testCommentWithParam() - inputComment.pig, to include comments within and at end of statements
          Hide
          Ashutosh Chauhan added a comment -

          I guess my question is what should be the behavior when $ is specified in the script and no substitution for it is provided. There are two options: a) If Pig encounters a $ and doesn't find a substitution for it, it fails right there.
          b) Pig logs a warning message and continue assuming user wants literal $ and not the substitution.

          Advantage for b) is there will not be a need of escaping. Disadvantage is when no substitution was unintentional, Pig will fail later, possibly with a different error message.
          Disadvantage of a) is it mandates user to escape $, where its possible not to have such requirement. Advantage is a clear error message can be thrown if no substitution was unintentional.

          What do you think which option shall we choose?

          Show
          Ashutosh Chauhan added a comment - I guess my question is what should be the behavior when $ is specified in the script and no substitution for it is provided. There are two options: a) If Pig encounters a $ and doesn't find a substitution for it, it fails right there. b) Pig logs a warning message and continue assuming user wants literal $ and not the substitution. Advantage for b) is there will not be a need of escaping. Disadvantage is when no substitution was unintentional, Pig will fail later, possibly with a different error message. Disadvantage of a) is it mandates user to escape $, where its possible not to have such requirement. Advantage is a clear error message can be thrown if no substitution was unintentional. What do you think which option shall we choose?
          Hide
          Thejas M Nair added a comment -

          I prefer option a. It is just a matter of putting a "\" before the $ in the scripts
          I think compared to the cost of time spending debugging a weird error or unexpected output results, the cost of a for the user is trivial.

          Ideally, I think we should support an option where user can change from default behavior (a) to (b) using a commandline switch or a statement in the script.

          Show
          Thejas M Nair added a comment - I prefer option a. It is just a matter of putting a "\" before the $ in the scripts I think compared to the cost of time spending debugging a weird error or unexpected output results, the cost of a for the user is trivial. Ideally, I think we should support an option where user can change from default behavior (a) to (b) using a commandline switch or a statement in the script.
          Hide
          Ashutosh Chauhan added a comment -

          compared to the cost of time spending debugging a weird error or unexpected output results, the cost of a for the user is trivial.

          I agree. Catching errors early and providing clear explanation for it is more important then the convenience of not requiring escaping. Thus a) is better.

          we should support an option where user can change from default behavior (a) to (b) using a commandline switch or a statement in the script.

          Adding yet another hook may not be worth the complexity/time. Lets keep this on hold unless someone specifically asks for this.

          Show
          Ashutosh Chauhan added a comment - compared to the cost of time spending debugging a weird error or unexpected output results, the cost of a for the user is trivial. I agree. Catching errors early and providing clear explanation for it is more important then the convenience of not requiring escaping. Thus a) is better. we should support an option where user can change from default behavior (a) to (b) using a commandline switch or a statement in the script. Adding yet another hook may not be worth the complexity/time. Lets keep this on hold unless someone specifically asks for this.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12425862/PIG-598.1.patch
          against trunk revision 882818.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 48 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The applied patch generated 213 javac compiler warnings (more than the trunk's current 211 warnings).

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          -1 release audit. The applied patch generated 361 release audit warnings (more than the trunk's current 356 warnings).

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/testReport/
          Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12425862/PIG-598.1.patch against trunk revision 882818. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 48 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 213 javac compiler warnings (more than the trunk's current 211 warnings). +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 361 release audit warnings (more than the trunk's current 356 warnings). +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/52/console This message is automatically generated.
          Hide
          Thejas M Nair added a comment -

          -1 javac. The applied patch generated 213 javac compiler warnings (more than the trunk's current 211 warnings).

          The additional warnings are from code generated by javacc, which cannot be fixed in the .jj files.

          -1 release audit. The applied patch generated 361 release audit warnings (more than the trunk's current 356 warnings).

          The release audit warnings are from new test input and benchmark files, because they don't have the apache license header.

          Show
          Thejas M Nair added a comment - -1 javac. The applied patch generated 213 javac compiler warnings (more than the trunk's current 211 warnings). The additional warnings are from code generated by javacc, which cannot be fixed in the .jj files. -1 release audit. The applied patch generated 361 release audit warnings (more than the trunk's current 356 warnings). The release audit warnings are from new test input and benchmark files, because they don't have the apache license header.
          Hide
          Olga Natkovich added a comment -

          patch committed. Thanks, Thejas for contributing - this change has been long overdue!

          Show
          Olga Natkovich added a comment - patch committed. Thanks, Thejas for contributing - this change has been long overdue!

            People

            • Assignee:
              Thejas M Nair
              Reporter:
              David Ciemiewicz
            • Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development