PIG-2931

$ signs in the replacement string make parameter substitution fail

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.0
    • Fix Version/s: 0.11
    • Component/s: None
    • Labels: None
    • Release Note: This changes the behavior of parameter substitution so that dollar signs in the replacement string are no longer treated as references to captured subsequences; they are now treated as literal characters. This may break backward compatibility for users who rely on the previous behavior of parameter substitution.

    Description

      To reproduce the issue, use the following Pig script:

      test.pig
      a = load 'data';
      b = filter by $FILTER;
      

      and run the following command:

      pig -x local -dryrun -f test.pig -p FILTER="(\$0 == 'a')"
      

      This generates the following script:

      test.pig.substituted
      a = load 'data';
      b = filter by ($FILTER == 'a');
      

      However, this should be:

      a = load 'data';
      b = filter by ($0 == 'a');
      

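      This is because Pig calls replaceFirst() with a replacement string that includes a $ sign. In a replacement string, $0 is a reference to the entire match, so it expands back to the matched text $FILTER:

      "$FILTER".replaceFirst("\\$FILTER", "($0 == 'a')");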
      

      To treat $ signs as literals in the replacement string, we must escape them. See the Javadoc for the Matcher class for an explanation:

      Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
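
The difference between the two behaviors can be sketched in plain Java. This is a minimal illustration, not the code from the attached patch: the class name and the two helper methods are hypothetical, and using Matcher.quoteReplacement() to escape the value is an assumption about one reasonable way to do it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not taken from Pig's source or from PIG-2931.patch.
public class DollarSubstitutionSketch {

    // Passes the parameter value to replaceFirst() unescaped, so "$0" in the
    // value is read as a reference to group 0 (the whole match).
    public static String substituteNaive(String line, String name, String value) {
        return line.replaceFirst(Pattern.quote("$" + name), value);
    }

    // Escapes '$' and '\' in the value first, so it is inserted literally.
    public static String substituteLiteral(String line, String name, String value) {
        return line.replaceFirst(Pattern.quote("$" + name),
                                 Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String line = "b = filter by $FILTER;";
        String value = "($0 == 'a')";

        // prints: b = filter by ($FILTER == 'a');
        System.out.println(substituteNaive(line, "FILTER", value));

        // prints: b = filter by ($0 == 'a');
        System.out.println(substituteLiteral(line, "FILTER", value));
    }
}
```

The first call reproduces the output reported above; the second produces the expected script.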

      Attachments

        1. PIG-2931.patch
          4 kB
          Cheolsoo Park

            People

              Assignee: Cheolsoo Park
              Reporter: Cheolsoo Park
              Votes: 0
              Watchers: 5
