Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-11506

Configuration variable expansion regex expensive for long values

    Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      Profiling several large Hadoop jobs, we discovered that a surprising amount of time was spent inside Configuration.get, more specifically, in regex matching caused by the substituteVars call.

      1. HADOOP-11506.001.patch
        4 kB
        Gera Shegalov
      2. HADOOP-11506.002.patch
        6 kB
        Gera Shegalov
      3. HADOOP-11506.003.patch
        9 kB
        Gera Shegalov
      4. HADOOP-11506.004.patch
        9 kB
        Gera Shegalov

        Issue Links

          Activity

          Hide
          dvryaboy Dmitriy V. Ryaboy added a comment -

          Most properties are not subject to variable substitution, and exit in the following code block:

           if (!match.find()) {
          return eval;
          }
          

          Getting there requires creating a matcher, allocating a HashSet, and evaluating the regex:

          private static final Pattern VAR_PATTERN =
          Pattern.compile("\\$\\{[^\\}\\$\u0020]+\\}");
          

          'tis far simpler to bail early and not do expensive regex evaluation in the majority of cases, by adding a simple check:

           if (expr == null) {
          return null;
          }
          if (!expr.contains("$")) {
            return expr;
          }
          

          (The new check is the second if condition above).

          Many users assume that Configuration.get() is a Map lookup, and call it inside map / reduce functions, which adds up to non-trivial overhead when the m/r functions are simple.

          Show
          dvryaboy Dmitriy V. Ryaboy added a comment - Most properties are not subject to variable substitution, and exit in the following code block: if (!match.find()) { return eval; } Getting there requires creating a matcher, allocating a HashSet, and evaluating the regex: private static final Pattern VAR_PATTERN = Pattern.compile( "\\$\\{[^\\}\\$\u0020]+\\}" ); 'tis far simpler to bail early and not do expensive regex evaluation in the majority of cases, by adding a simple check: if (expr == null ) { return null ; } if (!expr.contains( "$" )) { return expr; } (The new check is the second if condition above). Many users assume that Configuration.get() is a Map lookup, and call it inside map / reduce functions, which adds up to non-trivial overhead when the m/r functions are simple.
          Hide
          dvryaboy Dmitriy V. Ryaboy added a comment -

          (h/t Alex Levenson for the "$" check, I was going to suggest caching checked strings but this is simpler).

          Show
          dvryaboy Dmitriy V. Ryaboy added a comment - (h/t Alex Levenson for the "$" check, I was going to suggest caching checked strings but this is simpler).
          Hide
          dvryaboy Dmitriy V. Ryaboy added a comment -

          heck, for a couple extra cycles, let's change that a bit...

          if (expr.indexOf('$') == -1) {
            return expr;
          }
          
          Show
          dvryaboy Dmitriy V. Ryaboy added a comment - heck, for a couple extra cycles, let's change that a bit... if (expr.indexOf('$') == -1) { return expr; }
          Hide
          alexlevenson Alex Levenson added a comment -

          not sure which is more rare, $ or

          { or }, but given that $ is in some class names, and values are often classnames, maybe use { or }

          instead. Luckily none of them are in base64 values.

          Show
          alexlevenson Alex Levenson added a comment - not sure which is more rare, $ or { or }, but given that $ is in some class names, and values are often classnames, maybe use { or } instead. Luckily none of them are in base64 values.
          Hide
          dvryaboy Dmitriy V. Ryaboy added a comment -

          this is such a good idea I'm gonna edit my previous posts to replace mentions of $ with {

          Show
          dvryaboy Dmitriy V. Ryaboy added a comment - this is such a good idea I'm gonna edit my previous posts to replace mentions of $ with {
          Hide
          dvryaboy Dmitriy V. Ryaboy added a comment -

          or I would if this version of JIRA let me edit my previous posts.

          Show
          dvryaboy Dmitriy V. Ryaboy added a comment - or I would if this version of JIRA let me edit my previous posts.
          Hide
          dvryaboy Dmitriy V. Ryaboy added a comment -

          or I would if this version of JIRA let me edit my previous posts.

          Show
          dvryaboy Dmitriy V. Ryaboy added a comment - or I would if this version of JIRA let me edit my previous posts.
          Hide
          dvryaboy Dmitriy V. Ryaboy added a comment -

          as per Alex's suggestion, we should match on { not on a dollar sign.

          Show
          dvryaboy Dmitriy V. Ryaboy added a comment - as per Alex's suggestion, we should match on { not on a dollar sign.
          Hide
          andrew.wang Andrew Wang added a comment -

          Patch here would be most welcome, especially if you're willing to fire up perf and get some pretty numbers

          I flamegraph'd some MR jobs about a year ago and noticed a lot of time spent in Configuration, so I'd love to see this improved.

          Show
          andrew.wang Andrew Wang added a comment - Patch here would be most welcome, especially if you're willing to fire up perf and get some pretty numbers I flamegraph'd some MR jobs about a year ago and noticed a lot of time spent in Configuration, so I'd love to see this improved.
          Hide
          jira.shegalov Gera Shegalov added a comment -

          We will post a patch. We can generalize the suggested improvement by getting rid of regex for subbing because the pattern is quite simple.

          Show
          jira.shegalov Gera Shegalov added a comment - We will post a patch. We can generalize the suggested improvement by getting rid of regex for subbing because the pattern is quite simple.
          Hide
          dvryaboy Dmitriy V. Ryaboy added a comment -

          Andrew Wang no perf, but I did turn on -Xprof and in our case (which involves scalding, chill, and cascading all passing configs around like they are playing hot potato) the pre-patch version had this regex eating up about 16% of the SpillThread.

          Show
          dvryaboy Dmitriy V. Ryaboy added a comment - Andrew Wang no perf, but I did turn on -Xprof and in our case (which involves scalding, chill, and cascading all passing configs around like they are playing hot potato) the pre-patch version had this regex eating up about 16% of the SpillThread.
          Hide
          jira.shegalov Gera Shegalov added a comment -

          v1 with manual regex.

          Show
          jira.shegalov Gera Shegalov added a comment - v1 with manual regex.
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12694315/HADOOP-11506.001.patch
          against trunk revision 8f26d5a.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5471//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5471//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694315/HADOOP-11506.001.patch against trunk revision 8f26d5a. +1 @author . The patch does not contain any @author tags. -1 tests included . The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5471//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5471//console This message is automatically generated.
          Hide
          alexlevenson Alex Levenson added a comment -

          Gera Shegalov we could still short circuit + avoid allocating the HashSet by checking for indexOf("{"), what do you think?

          Show
          alexlevenson Alex Levenson added a comment - Gera Shegalov we could still short circuit + avoid allocating the HashSet by checking for indexOf("{"), what do you think?
          Hide
          jira.shegalov Gera Shegalov added a comment -

          Alex Levenson, yes it makes sense. This is a forward-port from 2.4 where the HashSet allocation problem does not exist as you can see in our internal RB.

          Show
          jira.shegalov Gera Shegalov added a comment - Alex Levenson , yes it makes sense. This is a forward-port from 2.4 where the HashSet allocation problem does not exist as you can see in our internal RB.
          Hide
          jira.shegalov Gera Shegalov added a comment -

          I looked more into this HashSet introduced by HADOOP-6871. This implementation intended to prevent circular substitutions is not quite correct for the general case.

          When we have a key k with the value p${k}s where at least one of p or s is a non-empty prefix/suffix. Each time substitution is performed it results in a new value

          1. pp${k}ss
          2. ppp${k}sss
            ...

          Here is 002 with an alternative implementation that simply checks whether value contains a further reference to the replaced variable.

          Some micro benchmark results. A random scalding job I looked at had a 10K-character value, thus testing a string value of this length with different characters injected in the middle:

          import org.apache.commons.lang.StringUtils;
          import org.apache.hadoop.conf.Configuration;
          
          public class HConf {
            public static void main(String[] args) {
              final Configuration conf = new Configuration(false);
              final Integer numIters = Integer.valueOf(args[0]);
              final String injectVar = args.length > 1 ? args[1] : null;
              String testVal = StringUtils.rightPad("", 10000, 'a');
              if (injectVar != null) {
                testVal = testVal.substring(0, testVal.length() / 2)
                    + injectVar
                    + testVal.substring(testVal.length() / 2 + 1, testVal.length());
              }
          
              conf.set("testVar", testVal);
              for (int i = 0; i < numIters; i++) {
                final String val = conf.get("testVar");
              }
            }
          }
          

          I test the following cases:

          1. No ${ in the value:

          $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000
          
          real    1m21.296s
          user    1m20.958s
          sys     0m0.351s
          
          $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000
          
          real    0m8.992s
          user    0m5.877s
          sys     0m0.213s
          

          ~10x improvement

          2. injecting '$'

          $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '$'
          
          real    1m13.073s
          user    1m11.457s
          sys     0m0.320s
          
          $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '$'
          
          real    0m5.746s
          user    0m5.794s
          sys     0m0.192s
          

          3. injecting '{'

          $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '{'
          
          real    1m19.289s
          user    1m19.116s
          sys     0m0.283s
          
          $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '{'
          
          real    0m6.251s
          user    0m6.331s
          sys     0m0.167s
          

          4. Injecting "${"

          $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '${'
          
          real    3m13.905s
          user    3m12.911s
          sys     0m0.503s
          
          $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '${'
          
          real    0m14.950s
          user    0m14.956s
          sys     0m0.217s
          

          13x improvement

          5. Injecting "${test}"

          $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '${test}'
          
          real    0m38.066s
          user    0m38.040s
          sys     0m0.272s
          
          $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '${test}'
          
          real    0m3.768s
          user    0m3.769s
          sys     0m0.268s
          

          The problem is less pronounced when there is something to replace but still 10x.

          Show
          jira.shegalov Gera Shegalov added a comment - I looked more into this HashSet introduced by HADOOP-6871 . This implementation intended to prevent circular substitutions is not quite correct for the general case. When we have a key k with the value p${k}s where at least one of p or s is a non-empty prefix/suffix. Each time substitution is performed it results in a new value pp${k}ss ppp${k}sss ... Here is 002 with an alternative implementation that simply checks whether value contains a further reference to the replaced variable. Some micro benchmark results. A random scalding job I looked at had a 10K-character value, thus testing a string value of this length with different characters injected in the middle: import org.apache.commons.lang.StringUtils; import org.apache.hadoop.conf.Configuration; public class HConf { public static void main( String [] args) { final Configuration conf = new Configuration( false ); final Integer numIters = Integer .valueOf(args[0]); final String injectVar = args.length > 1 ? args[1] : null ; String testVal = StringUtils.rightPad("", 10000, 'a'); if (injectVar != null ) { testVal = testVal.substring(0, testVal.length() / 2) + injectVar + testVal.substring(testVal.length() / 2 + 1, testVal.length()); } conf.set( "testVar" , testVal); for ( int i = 0; i < numIters; i++) { final String val = conf.get( "testVar" ); } } } I test the following cases: 1. No ${ in the value: $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 real 1m21.296s user 1m20.958s sys 0m0.351s $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 real 0m8.992s user 0m5.877s sys 0m0.213s ~10x improvement 2. injecting '$' $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '$' real 1m13.073s user 1m11.457s sys 0m0.320s $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '$' real 0m5.746s user 0m5.794s sys 0m0.192s 3. injecting '{' $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '{' real 1m19.289s user 1m19.116s sys 0m0.283s $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '{' real 0m6.251s user 0m6.331s sys 0m0.167s 4. Injecting "${" $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '${' real 3m13.905s user 3m12.911s sys 0m0.503s $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '${' real 0m14.950s user 0m14.956s sys 0m0.217s 13x improvement 5. Injecting "${test}" $ time ./hadoop-3.0.0-trunk/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '${test}' real 0m38.066s user 0m38.040s sys 0m0.272s $ time ./hadoop-3.0.0-HADOOP-11506/bin/hadoop jar testconf-1.0-SNAPSHOT.jar HConf 1000000 '${test}' real 0m3.768s user 0m3.769s sys 0m0.268s The problem is less pronounced when there is something to replace but still 10x.
          Hide
          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12694962/HADOOP-11506.002.patch
          against trunk revision d12dd47.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5517//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5517//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694962/HADOOP-11506.002.patch against trunk revision d12dd47. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5517//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5517//console This message is automatically generated.
          Hide
          jira.shegalov Gera Shegalov added a comment -

          Can you review this patch, Andrew Wang?

          Show
          jira.shegalov Gera Shegalov added a comment - Can you review this patch, Andrew Wang ?
          Hide
          andrew.wang Andrew Wang added a comment -

          Happy to, thanks for working on this everyone. Review comments below, only one substantial one. Core logic looks good though.

          • The new recursion detection only handles a two-var loop, but the old one handles bigger loops with the hash set. For instance, FOO1=FOO2, FOO2=FOO3, FOO3=FOO1. The behavior is a little different thus; before, we'd return the start of the loop, here, we'd hit the IllegalStateException at the end. I feel like the exception is arguably better, but compatibility dictates we preserve the existing behavior.

          Nits:

          • Not caused by your patch, but do you mind adding javadoc to substituteVars about the expected input/output? Same for findSubVariable would be nice.
          • Also good to document why we go through this dance rather than using a regex (can mention this JIRA #)
          • Did you use a return parameter for performance reasons? Eden allocation is pretty fast, so just returning seems fine. With all the substring and string concat flying around, we're doing a lot of allocation anyway.

          Spacing is a bit off here:

                eval =   eval.substring(0, dollar)
          

          remove "the"

              // that can occur in the nested class names
          

          I think this should be a right brace rather than left for ultimate clarity:

                int afterRightBrace = varBounds[SUB_END_IDX] + "{".length();
          

          Similar, I think "{c" should be "c}" for ultimate clarity:

                   bracePos > 0 && bracePos + "{c".length() < eval.length();
          
          Show
          andrew.wang Andrew Wang added a comment - Happy to, thanks for working on this everyone. Review comments below, only one substantial one. Core logic looks good though. The new recursion detection only handles a two-var loop, but the old one handles bigger loops with the hash set. For instance, FOO1=FOO2, FOO2=FOO3, FOO3=FOO1 . The behavior is a little different thus; before, we'd return the start of the loop, here, we'd hit the IllegalStateException at the end. I feel like the exception is arguably better, but compatibility dictates we preserve the existing behavior. Nits: Not caused by your patch, but do you mind adding javadoc to substituteVars about the expected input/output? Same for findSubVariable would be nice. Also good to document why we go through this dance rather than using a regex (can mention this JIRA #) Did you use a return parameter for performance reasons? Eden allocation is pretty fast, so just returning seems fine. With all the substring and string concat flying around, we're doing a lot of allocation anyway. Spacing is a bit off here: eval = eval.substring(0, dollar) remove "the" // that can occur in the nested class names I think this should be a right brace rather than left for ultimate clarity: int afterRightBrace = varBounds[SUB_END_IDX] + "{" .length(); Similar, I think "{c" should be "c}" for ultimate clarity: bracePos > 0 && bracePos + "{c" .length() < eval.length();
          Hide
          jira.shegalov Gera Shegalov added a comment -

          Thanks Andrew Wang for the thorough review.

          I added tests for 3-variable resolution loops, and javdoc you requested, fixed whitespaces.

          You are right about "}" in afterRightBrace definition.

          As for

          bracePos > 0 && bracePos + "{c".length() < eval.length();
          

          The intuition is: bracePos points to a left brace '{' of the minimum valid var expression "${c}". Thus, the closing right brace pos of the minimum valid var is bracePos + "{c".length. Adding some more comments.

          Show
          jira.shegalov Gera Shegalov added a comment - Thanks Andrew Wang for the thorough review. I added tests for 3-variable resolution loops, and javdoc you requested, fixed whitespaces. You are right about "}" in afterRightBrace definition. As for bracePos > 0 && bracePos + "{c" .length() < eval.length(); The intuition is: bracePos points to a left brace '{' of the minimum valid var expression "${c}" . Thus, the closing right brace pos of the minimum valid var is bracePos + "{c".length . Adding some more comments.
          Hide
          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12695779/HADOOP-11506.003.patch
          against trunk revision ffc75d6.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          -1 release audit. The applied patch generated 1 release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5549//testReport/
          Release audit warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/5549//artifact/patchprocess/patchReleaseAuditProblems.txt
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5549//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12695779/HADOOP-11506.003.patch against trunk revision ffc75d6. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. -1 release audit . The applied patch generated 1 release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5549//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/5549//artifact/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5549//console This message is automatically generated.
          Hide
          jira.shegalov Gera Shegalov added a comment -

          The release audit warning is tracked by YARN-3113. The patch is reviewable, Andrew Wang.

          Show
          jira.shegalov Gera Shegalov added a comment - The release audit warning is tracked by YARN-3113 . The patch is reviewable, Andrew Wang .
          Hide
          andrew.wang Andrew Wang added a comment -

          +1 thanks Gera

          Show
          andrew.wang Andrew Wang added a comment - +1 thanks Gera
          Hide
          jira.shegalov Gera Shegalov added a comment -

          Thank you, Andrew. I'll wait for a couple of days to see if there are more comments before committing.

          Show
          jira.shegalov Gera Shegalov added a comment - Thank you, Andrew. I'll wait for a couple of days to see if there are more comments before committing.
          Hide
          jira.shegalov Gera Shegalov added a comment -

          Minor fix: I had an unintended HashSet size specification:

          diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          index 9380d68..c47e771 100644
          --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          @@ -963,7 +963,7 @@ private String substituteVars(String expr) {
                 final int afterRightBrace = varBounds[SUB_END_IDX] + "}".length();
                 final String refVar = eval.substring(dollar, afterRightBrace);
                 if (evalSet == null) {
          -        evalSet = new HashSet<String>(1);
          +        evalSet = new HashSet<String>();
                 }
                 if (!evalSet.add(refVar)) {
                   return expr; // return original expression if there is a loop
          
          Show
          jira.shegalov Gera Shegalov added a comment - Minor fix: I had an unintended HashSet size specification: diff --git a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java index 9380d68..c47e771 100644 --- a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java +++ b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java @@ -963,7 +963,7 @@ private String substituteVars( String expr) { final int afterRightBrace = varBounds[SUB_END_IDX] + "}" .length(); final String refVar = eval.substring(dollar, afterRightBrace); if (evalSet == null ) { - evalSet = new HashSet< String >(1); + evalSet = new HashSet< String >(); } if (!evalSet.add(refVar)) { return expr; // return original expression if there is a loop
          Hide
          hadoopqa Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12696002/HADOOP-11506.004.patch
          against trunk revision 5f9a0dd.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5559//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5559//console

          This message is automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall . Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12696002/HADOOP-11506.004.patch against trunk revision 5f9a0dd. +1 @author . The patch does not contain any @author tags. +1 tests included . The patch appears to include 1 new or modified test files. +1 javac . The applied patch does not increase the total number of javac compiler warnings. +1 javadoc . There were no new javadoc warning messages. +1 eclipse:eclipse . The patch built with eclipse:eclipse. +1 findbugs . The patch does not introduce any new Findbugs (version 2.0.3) warnings. +1 release audit . The applied patch does not increase the total number of release audit warnings. +1 core tests . The patch passed unit tests in hadoop-common-project/hadoop-common. Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5559//testReport/ Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5559//console This message is automatically generated.
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #7039 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7039/)
          HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-trunk-Commit #7039 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7039/ ) HADOOP-11506 . Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713) hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Hide
          jira.shegalov Gera Shegalov added a comment -

          Committed to trunk and branch-2. Thanks Andrew Wang for reviews. Thanks Dmitriy V. Ryaboy and Alex Levenson for inputs and additional reviews.

          Show
          jira.shegalov Gera Shegalov added a comment - Committed to trunk and branch-2. Thanks Andrew Wang for reviews. Thanks Dmitriy V. Ryaboy and Alex Levenson for inputs and additional reviews.
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #96 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/96/)
          HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713)

          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          • hadoop-common-project/hadoop-common/CHANGES.txt
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #96 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/96/ ) HADOOP-11506 . Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java hadoop-common-project/hadoop-common/CHANGES.txt
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #830 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/830/)
          HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Yarn-trunk #830 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/830/ ) HADOOP-11506 . Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713) hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #93 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/93/)
          HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713)

          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          • hadoop-common-project/hadoop-common/CHANGES.txt
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #93 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/93/ ) HADOOP-11506 . Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java hadoop-common-project/hadoop-common/CHANGES.txt
          Hide
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Hdfs-trunk #2028 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2028/)
          HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713)

          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Show
          hudson Hudson added a comment - SUCCESS: Integrated in Hadoop-Hdfs-trunk #2028 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2028/ ) HADOOP-11506 . Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #97 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/97/)
          HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #97 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/97/ ) HADOOP-11506 . Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713) hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Hide
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #2047 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2047/)
          HADOOP-11506. Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713)

          • hadoop-common-project/hadoop-common/CHANGES.txt
          • hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
          • hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Show
          hudson Hudson added a comment - FAILURE: Integrated in Hadoop-Mapreduce-trunk #2047 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2047/ ) HADOOP-11506 . Configuration variable expansion regex expensive for long values. (Gera Shegalov via gera) (gera: rev 644548f201743408904dfe24b9f5b515b2c96713) hadoop-common-project/hadoop-common/CHANGES.txt hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Pulled this into 2.6.1 after Akira Ajisaka verified that the patch applies cleanly. (The 2.7 patch itself departed a bit from trunk). Ran compilation and TestConfiguration before the push.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - Pulled this into 2.6.1 after Akira Ajisaka verified that the patch applies cleanly. (The 2.7 patch itself departed a bit from trunk). Ran compilation and TestConfiguration before the push.

            People

            • Assignee:
              jira.shegalov Gera Shegalov
              Reporter:
              dvryaboy Dmitriy V. Ryaboy
            • Votes:
              0 Vote for this issue
              Watchers:
              16 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development