Details
Bug
Status: Closed
Major
Resolution: Fixed
3.1.2
None
Description
SystemVariables#substitute() handles circular references between variables by capping the number of substitution rounds at 40 by default. If the substituted part is sufficiently large, however, the substitution can produce a string bigger than the heap long before those 40 rounds are used up.
Take the following test case, which fails with an OutOfMemoryError on current master (the third round of substitution would need a 10G heap, while the test runs with only 2G):
@Test
public void testSubstitute() {
    String randomPart = RandomStringUtils.random(100_000);
    String reference = "${hiveconf:myTestVariable}";

    StringBuilder longStringWithReferences = new StringBuilder();
    for (int i = 0; i < 10; i++) {
        longStringWithReferences.append(randomPart).append(reference);
    }

    SystemVariables uut = new SystemVariables();
    HiveConf conf = new HiveConf();
    conf.set("myTestVariable", longStringWithReferences.toString());

    uut.substitute(conf, longStringWithReferences.toString(), 40);
}
Produces:
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at org.apache.hadoop.hive.conf.SystemVariables.substitute(SystemVariables.java:110)
    at org.apache.hadoop.hive.conf.SystemVariablesTest.testSubstitute(SystemVariablesTest.java:27)
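To make the growth concrete, here is a rough back-of-the-envelope sketch. It builds no strings; it only estimates the expression length after each round. It assumes that every round replaces each reference with the full variable value and that the value equals the initial expression, as in the test case. The class and method names are hypothetical and not part of Hive.

```java
// Hypothetical helper, not part of Hive: estimates how long the expression
// becomes when a self-referencing variable is expanded round after round.
final class SubstitutionGrowth {

    // literalChars: total non-reference characters in the value
    // refChars:     length of one reference, e.g. "${hiveconf:myTestVariable}" is 26
    // refCount:     number of references contained in the value
    // rounds:       number of substitution rounds to simulate
    static long lengthAfter(long literalChars, long refChars, long refCount, int rounds) {
        long valueLength = literalChars + refCount * refChars; // value == initial expression
        long length = valueLength;
        long refs = refCount;
        for (int r = 0; r < rounds; r++) {
            // Assumption: each round swaps every reference for the whole value.
            length = length - refs * refChars + refs * valueLength;
            // Each inserted value brings refCount new references with it.
            refs = refs * refCount;
        }
        return length;
    }

    public static void main(String[] args) {
        // Figures from the test case: 10 x 100_000 random chars plus 10 references.
        for (int r = 0; r <= 3; r++) {
            System.out.println("round " + r + ": "
                    + lengthAfter(1_000_000L, 26L, 10L, r) + " chars");
        }
    }
}
```

Under these assumptions the length grows roughly tenfold per round and passes one billion characters in the third round, before counting the intermediate copies made by StringBuilder.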
We should check the size of the substituted query and bail out earlier.
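A minimal sketch of what such a size check could look like, written as a standalone class rather than as a patch to SystemVariables. The variable pattern, the plain Map used in place of HiveConf, the maxLength parameter and the exception message are all assumptions for illustration and are not taken from the actual fix.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not Hive code: round-limited substitution
// that also bails out once the result would exceed a length limit.
final class BoundedSubstitution {

    // Simplified stand-in for the real variable pattern: matches ${name}.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}$ ]+)\\}");

    static String substitute(Map<String, String> vars, String expr, int depth, int maxLength) {
        String eval = expr;
        for (int round = 0; round < depth; round++) {
            Matcher m = VAR.matcher(eval);
            StringBuilder out = new StringBuilder();
            int prev = 0;
            boolean replaced = false;
            while (m.find()) {
                String value = vars.get(m.group(1));
                if (value == null) {
                    value = m.group(); // unknown variable: keep the reference text
                } else {
                    replaced = true;
                }
                // Check the size before appending, so the buffer never grows past the limit.
                checkSize((long) out.length() + (m.start() - prev) + value.length(), maxLength);
                out.append(eval, prev, m.start()).append(value);
                prev = m.end();
            }
            if (!replaced) {
                return eval; // nothing left to substitute
            }
            checkSize((long) out.length() + (eval.length() - prev), maxLength);
            out.append(eval, prev, eval.length());
            eval = out.toString();
        }
        return eval; // round limit reached; the caller decides how to treat this
    }

    private static void checkSize(long length, int maxLength) {
        if (length > maxLength) {
            throw new IllegalStateException(
                    "Substituted text would exceed " + maxLength
                    + " characters; possible circular variable reference");
        }
    }
}
```

With this guard, a self-referencing variable fails fast with an exception as soon as the limit is crossed, and the buffer never grows much beyond maxLength. A reasonable value for maxLength is a policy decision that this sketch leaves open.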