  1. Hadoop Common
  2. HADOOP-14216

Improve Configuration XML Parsing Performance

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: None
    • Labels:
      None

      Description

      This JIRA is to improve XML parsing performance through reuse and a change of XML parser (to StAX).

      1. HADOOP-14216.1.patch
        22 kB
        Jonathan Eagles
      2. HADOOP-14216.2.patch
        21 kB
        Jonathan Eagles
      3. HADOOP-14216.2-branch-2.patch
        21 kB
        Jonathan Eagles
      4. HADOOP-14216.addendum.1.patch
        2 kB
        Jonathan Eagles

          Activity

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote Subsystem Runtime Comment
          0 reexec 0m 21s Docker mode activated.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
          0 mvndep 0m 13s Maven dependency ordering for branch
          +1 mvninstall 12m 39s trunk passed
          +1 compile 19m 52s trunk passed
          +1 checkstyle 1m 57s trunk passed
          +1 mvnsite 1m 19s trunk passed
          +1 mvneclipse 0m 37s trunk passed
          0 findbugs 0m 0s Skipped patched modules with no Java source: hadoop-project
          +1 findbugs 1m 26s trunk passed
          +1 javadoc 1m 11s trunk passed
          0 mvndep 0m 19s Maven dependency ordering for patch
          +1 mvninstall 0m 49s the patch passed
          +1 compile 16m 44s the patch passed
          +1 javac 16m 44s the patch passed
          -0 checkstyle 2m 11s root: The patch generated 8 new + 267 unchanged - 21 fixed = 275 total (was 288)
          +1 mvnsite 1m 31s the patch passed
          +1 mvneclipse 0m 43s the patch passed
          +1 whitespace 0m 0s The patch has no whitespace issues.
          +1 xml 0m 3s The patch has no ill-formed XML file.
          0 findbugs 0m 0s Skipped patched modules with no Java source: hadoop-project
          +1 findbugs 1m 39s the patch passed
          +1 javadoc 1m 18s the patch passed
          +1 unit 0m 19s hadoop-project in the patch passed.
          -1 unit 9m 1s hadoop-common in the patch failed.
          +1 asflicense 0m 37s The patch does not generate ASF License warnings.
          99m 13s



          Reason Tests
          Failed junit tests hadoop.security.TestKDiag
            hadoop.net.TestDNS



          Subsystem Report/Notes
          Docker Image:yetus/hadoop:a9ad5d6
          JIRA Issue HADOOP-14216
          JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12860071/HADOOP-14216.1.patch
          Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit xml findbugs checkstyle
          uname Linux 8cfeaed53a10 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
          Build tool maven
          Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision trunk / f462e1f
          Default Java 1.8.0_121
          findbugs v3.0.0
          checkstyle https://builds.apache.org/job/PreCommit-HADOOP-Build/11891/artifact/patchprocess/diff-checkstyle-root.txt
          unit https://builds.apache.org/job/PreCommit-HADOOP-Build/11891/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
          Test Results https://builds.apache.org/job/PreCommit-HADOOP-Build/11891/testReport/
          modules C: hadoop-project hadoop-common-project/hadoop-common U: .
          Console output https://builds.apache.org/job/PreCommit-HADOOP-Build/11891/console
          Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          chris.douglas Chris Douglas added a comment - - edited

          I only skimmed the patch, but this looks good. Just a few questions:

          • I have no trouble believing this shows up in perf traces, but out of curiosity, do you have any data on the improvement this effects?
          • Filed HADOOP-14225; can we use the dependency added here to replace xmlenc?
          • Can you update/add/link docs on the xi:fallback and xi:include semantics? How do these interact with final config values?
          jeagles Jonathan Eagles added a comment -

          Client-Side Performance Tests:

          Setup: essentially, run normal user commands and measure the performance gains with only the client hadoop-common.jar replaced with a patched version.

          Eyeball test:
          1. hadoop fs -ls

          # baseline - ran dozens of times, this is a typical result
          $ time hadoop fs -ls /
          real	0m2.694s
          user	0m6.633s
          sys	0m0.303s
          
          # patched version - ran dozens of times, this is a typical result
          $ time HADOOP_USER_CLASSPATH_FIRST=true HADOOP_CLASSPATH="./hadoop-common-2.8.1-HADOOP-14216.jar:./stax2-api-3.1.4.jar:./aalto-xml-1.0.0.jar" hadoop fs -ls /
          real	0m2.335s
          user	0m4.963s
          sys	0m0.292s
          

          ===========================
          Result on a real cluster is roughly 300 ms real and 1700 ms user faster per hadoop fs -ls command

          2. yarn application -list

          $ time yarn application -list
          real	0m1.867s
          user	0m5.178s
          sys	0m0.288s
          
          $ time YARN_USER_CLASSPATH="./hadoop-common-2.8.1-HADOOP-14216.jar:./stax2-api-3.1.4.jar:./aalto-xml-1.0.0.jar" YARN_USER_CLASSPATH_FIRST=true yarn application -list
          real	0m1.607s
          user	0m3.911s
          sys	0m0.225s
          

          ===========================
          Result on a real cluster is roughly 250 ms real and 1200 ms user faster per yarn application -list command

          Performance Numbers at scale

          ConfPerf.java
          import org.apache.hadoop.conf.Configuration;
          
          public class ConfPerf {
            public static void main(String[] args) throws Exception {
              long start = System.currentTimeMillis();
              long count = 0;
              Configuration.addDefaultResource("core-default.xml");
              Configuration.addDefaultResource("core-site.xml");
              Configuration.addDefaultResource("yarn-default.xml");
              Configuration.addDefaultResource("yarn-site.xml");
              Configuration.addDefaultResource("mapred-default.xml");
              Configuration.addDefaultResource("mapred-site.xml");
              Configuration.addDefaultResource("hdfs-default.xml");
              Configuration.addDefaultResource("hdfs-site.xml");
              for (int i = 0; i < 3000; i++) {
                Configuration conf = new Configuration();
                conf.get("trigger.loading");
                count += conf.size();
              }
              long end = System.currentTimeMillis();
              System.out.println("duration: " + (end - start) + " count: " + count);
            }
          }
          
          # setup performance tests
          $ javac -cp ./:`hadoop classpath` ConfPerf.java
          
          # baseline performance numbers
          $ time java -cp ./:`hadoop classpath` ConfPerf
          real	0m52.456s
          user	1m2.209s
          sys	0m3.601s
          
          # performance numbers with patch
          $ time java -cp ./:./hadoop-common-2.8.1-HADOOP-14216.jar:./stax2-api-3.1.4.jar:./aalto-xml-1.0.0.jar:`hadoop classpath` ConfPerf
          real	0m23.108s
          user	0m27.434s
          sys	0m1.816s
          

          ===========================
          Results on a real cluster are roughly 29300 ms real and 34800 ms user faster
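
The ConfPerf timings quoted above can also be expressed as ratios and a rough per-load cost. The following is a small sketch that uses only the numbers from this comment; the class and method names are made up for illustration, and the per-load figure includes JVM startup, so it is an upper bound rather than a precise measurement.

```java
public class SpeedupSketch {
  // Convert a `time`-style reading (minutes plus seconds) to seconds.
  public static double seconds(int minutes, double secs) {
    return minutes * 60 + secs;
  }

  public static void main(String[] args) {
    int loads = 3000; // loop count in ConfPerf.java

    double baseReal = seconds(0, 52.456);
    double patchedReal = seconds(0, 23.108);
    double baseUser = seconds(1, 2.209);
    double patchedUser = seconds(0, 27.434);

    // Absolute savings in milliseconds and the ratio baseline/patched.
    System.out.printf("real: saved %.0f ms, ratio %.2fx%n",
        (baseReal - patchedReal) * 1000, baseReal / patchedReal);
    System.out.printf("user: saved %.0f ms, ratio %.2fx%n",
        (baseUser - patchedUser) * 1000, baseUser / patchedUser);

    // Wall-clock time divided by the number of Configuration loads.
    System.out.printf("per load: %.1f ms before, %.1f ms after%n",
        baseReal * 1000 / loads, patchedReal * 1000 / loads);
  }
}
```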

          Equality Test

          ConfEquality.java
          import org.apache.hadoop.conf.Configuration;
          
          public class ConfEquality {
            public static void main(String[] args) throws Exception {
              Configuration.addDefaultResource("core-default.xml");
              Configuration.addDefaultResource("core-site.xml");
              Configuration.addDefaultResource("yarn-default.xml");
              Configuration.addDefaultResource("yarn-site.xml");
              Configuration.addDefaultResource("mapred-default.xml");
              Configuration.addDefaultResource("mapred-site.xml");
              Configuration.addDefaultResource("hdfs-default.xml");
              Configuration.addDefaultResource("hdfs-site.xml");
              Configuration conf = new Configuration();
              conf.get("trigger.loading");
              conf.writeXml(System.out);
            }
          }
          
          # prepare the equality test
          $ javac -cp ./:`hadoop classpath` ConfEquality.java
          # run the equality test
          $ diff <(java -cp ./:`hadoop classpath` ConfEquality) <(java -cp ./:./hadoop-common-2.8.1-HADOOP-14216.jar:./stax2-api-3.1.4.jar:./aalto-xml-1.0.0.jar:`hadoop classpath` ConfEquality)
          
          jeagles Jonathan Eagles added a comment -

          Regarding xi:include and xi:fallback: this was never a documented feature that I could find, and no tests proved continued support. However, because the underlying Xerces2 XML parser was used, the XML extension https://www.w3.org/TR/xinclude/ was available, and examples were found to be using this feature. StAX parsers, by design or maturity, don't support XInclude, so it had to be implemented here; that was mostly trivial. The essential feature is that xi:include statements are XML inline statements. Thus, the source remains the same. xi:fallback statements are not used unless the xi:include fetch fails. Properties marked final in an xi:include cannot be overridden in the remaining document, but non-final configuration can be overridden in the remaining document. If a fetch fails on an xi:include statement and no fallback is found, then that is a fatal error.
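
The semantics described above can be sketched with the JDK's StAX reader. This is not the code from the patch: `XIncludeSketch`, its in-memory `resources` map (standing in for the file system), and the simplified property handling are all assumptions made for illustration, and an include nested inside a fallback in the same document is not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class XIncludeSketch {
  static final String XI_NS = "http://www.w3.org/2001/XInclude";

  public final Map<String, String> props = new HashMap<>();
  private final Set<String> finals = new HashSet<>();
  // Hypothetical stand-in for the file system: href -> XML text.
  private final Map<String, String> resources;

  public XIncludeSketch(Map<String, String> resources) {
    this.resources = resources;
  }

  public void load(String xml) throws Exception {
    XMLStreamReader r = XMLInputFactory.newInstance()
        .createXMLStreamReader(new StringReader(xml));
    StringBuilder text = new StringBuilder();
    String name = null, value = null;
    boolean isFinal = false, fetchFailed = false, fallbackSeen = false;
    while (r.hasNext()) {
      int event = r.next();
      if (event == XMLStreamConstants.START_ELEMENT) {
        text.setLength(0);
        boolean xi = XI_NS.equals(r.getNamespaceURI());
        if (xi && "include".equals(r.getLocalName())) {
          String body = resources.get(r.getAttributeValue(null, "href"));
          fetchFailed = (body == null);
          fallbackSeen = false;
          if (body != null) {
            load(body); // inline the included document at this point
          }
        } else if (xi && "fallback".equals(r.getLocalName())) {
          fallbackSeen = true;
          if (!fetchFailed) {
            skipElement(r); // fetch succeeded, so the fallback is unused
          }
        } else if ("property".equals(r.getLocalName())) {
          name = null; value = null; isFinal = false;
        }
      } else if (event == XMLStreamConstants.CHARACTERS) {
        text.append(r.getText());
      } else if (event == XMLStreamConstants.END_ELEMENT) {
        String tag = r.getLocalName();
        if (XI_NS.equals(r.getNamespaceURI())) {
          if ("include".equals(tag) && fetchFailed && !fallbackSeen) {
            throw new IOException("Fetch fail on include with no fallback");
          }
        } else if ("name".equals(tag)) {
          name = text.toString().trim();
        } else if ("value".equals(tag)) {
          value = text.toString();
        } else if ("final".equals(tag)) {
          isFinal = "true".equals(text.toString().trim());
        } else if ("property".equals(tag) && name != null
            && !finals.contains(name)) {
          props.put(name, value); // later non-final values overwrite earlier
          if (isFinal) {
            finals.add(name); // a final value blocks any later override
          }
        }
      }
    }
  }

  // Consume events up to and including the end tag of the current element.
  private static void skipElement(XMLStreamReader r) throws Exception {
    int depth = 1;
    while (depth > 0) {
      int event = r.next();
      if (event == XMLStreamConstants.START_ELEMENT) depth++;
      else if (event == XMLStreamConstants.END_ELEMENT) depth--;
    }
  }
}
```

In this sketch, a property marked final in the included text keeps its value even if the including document sets it again later, while a non-final one is overwritten; an include whose href is missing from the map and that has no fallback child raises an IOException.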

          chris.douglas Chris Douglas added a comment -

          +1 lgtm

          jeagles Jonathan Eagles added a comment -

          Is this a candidate for other release lines (branch-2, for example)? In that case I can prepare a separate patch.

          arpitagarwal Arpit Agarwal added a comment -

          Yes this looks like a good candidate for 2.x. Thanks.

          jeagles Jonathan Eagles added a comment -

          Rebased patch post HADOOP-14213 and provided a patch for branch-2.

          jeagles Jonathan Eagles added a comment -

          I'll commit this to trunk and branch-2 tonight giving everyone else a few more hours to comment.

          jeagles Jonathan Eagles added a comment -

          Thanks to everyone (Chris Douglas, Arpit Agarwal) who helped to resolve this patch. Committed to trunk and branch-2

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11488 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11488/)
          HADOOP-14216. Improve Configuration XML Parsing Performance (jeagles) (jeagles: rev 523f467d939d80e2bc162e1f47be497109783061)

          • (edit) hadoop-project/pom.xml
          • (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          • (edit) hadoop-common-project/hadoop-common/pom.xml
          • (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestConfiguration.java
          ajisakaa Akira Ajisaka added a comment -

          This broke many tests in hadoop-tools which rely on the XInclude feature.

          hadoop-tools/hadoop-aws/src/test/resources/core-site.xml
            <!--
            To run these tests.
          
            # Create a file auth-keys.xml  - DO NOT ADD TO REVISION CONTROL
            # add the property test.fs.s3n.name to point to an S3 filesystem URL
            # Add the credentials for the service you are testing against
            -->
            <include xmlns="http://www.w3.org/2001/XInclude" href="auth-keys.xml">
              <fallback/>
            </include>
          
          jeagles Jonathan Eagles added a comment -

          Akira Ajisaka, let me take a quick look. We can revert if the change is going to be too great.

          jeagles Jonathan Eagles added a comment -

          Akira Ajisaka, I am not able to run the S3 AWS tests yet. Can you test while I get the setup working on my side?

          ajisakaa Akira Ajisaka added a comment -

          The addendum patch LGTM, +1. I haven't fully tested it yet, but I confirmed that the settings in auth-keys.xml are included in the Configuration instance. Thanks, Jonathan Eagles.

          jeagles Jonathan Eagles added a comment -

          Thanks, Akira Ajisaka. Committed the addendum patch to trunk and branch-2. Please reply with final results.

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11501 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11501/)
          HADOOP-14216. Addendum to Improve Configuration XML Parsing Performance (jeagles: rev 1309c585fb9f632f7c649464ecbe358c5130b142)

          • (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
          stevel@apache.org Steve Loughran added a comment -

          Just tracked this down as the likely cause of my S3A test failures. This is pulling in core-site.xml, which then xincludes auth-keys.xml, which finally references an absolute path, file://home/stevel/(secret)/aws-keys.xml. This is failing for me even with the latest patch in. Either transient XIncludes aren't being picked up or

          
          testProviderAbstractClass(org.apache.hadoop.fs.s3a.TestS3AAWSCredentialsProvider)  Time elapsed: 0.005 sec  <<< ERROR!
          java.lang.RuntimeException: java.io.IOException: Fetch fail on include with no fallback while loading 'core-site.xml'
          	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2831)
          	at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2777)
          	at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2657)
          	at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2545)
          	at org.apache.hadoop.conf.Configuration.set(Configuration.java:1238)
          	at org.apache.hadoop.conf.Configuration.set(Configuration.java:1210)
          	at org.apache.hadoop.fs.s3a.TestS3AAWSCredentialsProvider.expectProviderInstantiationFailure(TestS3AAWSCredentialsProvider.java:224)
          	at org.apache.hadoop.fs.s3a.TestS3AAWSCredentialsProvider.testProviderAbstractClass(TestS3AAWSCredentialsProvider.java:60)
          

          Note also I think the error could be improved. 1. It's in the included file where the problem appears to lie and 2. we should really know the missing entry. Perhaps a wiki link too: I had to read the XInclude spec to work out what was going on here before I could go back to finding the cause

          stevel@apache.org Steve Loughran added a comment -

          This is breaking XInclude for me, which I'm using to pull in resources (aws credentials) via an XInclude to a file:// URL in the resource /auth-keys.xml, which is itself pulled in from core-site.xml

          Here's details on my setup.

          It's failing, even on the non IT tests, the ones which don't need a set of credentials to work. They still load in core-site, they still want to pull in XIncludes. They now fail.

          I tried using the xi: prefix explicitly, but no, nothing there.

            <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
              href="file:///home/stevel/.aws/html-keys.xml" >
            </xi:include>
          

          (note: not the real path, before anyone thinks of a way to steal my secrets)

          What does work is removing the file:// prefix:

            <include xmlns="http://www.w3.org/2001/XInclude"
              href="///home/stevel/.aws/html-keys.xml" >
            </include>
          

          Makes me think the issue here is the fallback logic: if the XInclude href is a full URI, it should be used as is. Also, if the file is missing: log @ info before falling back, so people get a hint of what is playing up.

          I now know enough about the problem to change my auth-keys file and so get tests running again. However, the XInclude reference logic has changed; I don't know who else is expecting file:// or other references to work.

          jeagles Jonathan Eagles added a comment -

          Sorry guys. Have been OOO this week. Will take a look at this soon to understand this fully. The logging for a missing xinclude could be done quite easily. In previous versions there was no logging, so I have to wonder what the logic should be. xinclude with fallback simply means include the file only if it is present, fail otherwise unless a fallback is specified. The above failure seems like a missed case, but the extra logging could be helpful. What do you think?

          stevel@apache.org Steve Loughran added a comment -

          I think the problem I'm seeing isn't fallback related; it's that the XInclude you've had to implement can't handle URIs in the reference. I think it should see if the ref for the include can be used in a new URI() call before trying to create a file:// URI from it; if it can become a URI, then it's good to go as is.
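
The check suggested above might look like the following. This is a minimal sketch, not the eventual fix: `IncludeHrefSketch` and `resolveHref` are hypothetical names, and it assumes that "can become a URI" means the href parses and carries a scheme, with everything else treated as a local file path.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.Paths;

public class IncludeHrefSketch {
  /**
   * Resolve an XInclude href. An href that parses as an absolute URI
   * (one with a scheme, such as file:) is returned unchanged; anything
   * else is treated as a local file path and converted to a file: URI.
   */
  public static URI resolveHref(String href) {
    try {
      URI uri = new URI(href);
      if (uri.isAbsolute()) {
        return uri; // already a full URI, use it as is
      }
    } catch (URISyntaxException e) {
      // Not a valid URI; fall through and treat it as a plain path.
    }
    return Paths.get(href).toAbsolutePath().toUri();
  }

  public static void main(String[] args) {
    // Full URI: kept as written.
    System.out.println(resolveHref("file:///home/user/keys.xml"));
    // Bare file name: resolved against the current working directory.
    System.out.println(resolveHref("auth-keys.xml"));
  }
}
```

Resolving a relative href against the working directory is a simplification here; a real implementation would more likely resolve it against the location of the including resource.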

          jeagles Jonathan Eagles added a comment -

          Exactly, I'll put a patch up with a test using a full URI to ensure this doesn't go missing.

          andrew.wang Andrew Wang added a comment -

          Did the follow-on work for the full-URI XInclude href happen? I'd like to track that in a follow-on JIRA and resolve this one, since it was committed.

          stevel@apache.org Steve Loughran added a comment -

          Not yet; there's also HADOOP-14387 which is even more serious.

          I think I'm finding all these first just because I'm the person doing the most two-hop-testing with trunk; that doesn't mean they aren't going to surface in the real world.

          andrew.wang Andrew Wang added a comment -

          I'm going to re-resolve this so the release notes are consistent. If we revert this JIRA, then it's appropriate to re-open.

          I'll file a follow-on issue for tracking.

          andrew.wang Andrew Wang added a comment -

          Created HADOOP-14399 for tracking. I pasted bits of comments into the description, please feel free to edit and improve.


            People

            • Assignee:
              jeagles Jonathan Eagles
              Reporter:
              jeagles Jonathan Eagles
            • Votes:
              0
              Watchers:
              18
