Hadoop Common / HADOOP-10623

Provide a utility to inspect the config as seen by a Hadoop client / daemon

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Target Version/s:

      Description

      To ease debugging of config issues, it is convenient to be able to generate the config as seen by the job client or a Hadoop daemon:

      $ hadoop org.apache.hadoop.util.ConfigTool -help
      Usage: ConfigTool [ -xml | -json ] [ -loadDefaults ] [ resource1... ]
            if resource contains '/', load from local filesystem
            otherwise, load from the classpath
      
      Generic options supported are
      -conf <configuration file>     specify an application configuration file
      -D <property=value>            use value for given property
      -fs <local|namenode:port>      specify a namenode
      -jt <local|jobtracker:port>    specify a job tracker
      -files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
      -libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
      -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
      
      The general command line syntax is
      bin/hadoop command [genericOptions] [commandOptions]
      
      $ hadoop org.apache.hadoop.util.ConfigTool -Dmy.test.conf=val mapred-site.xml ./hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/etc/hadoop/core-site.xml | python -mjson.tool
      {
          "properties": [
              {
                  "isFinal": false,
                  "key": "mapreduce.framework.name",
                  "resource": "mapred-site.xml",
                  "value": "yarn"
              },
              {
                  "isFinal": false,
                  "key": "mapreduce.client.genericoptionsparser.used",
                  "resource": "programatically",
                  "value": "true"
              },
              {
                  "isFinal": false,
                  "key": "my.test.conf",
                  "resource": "from command line",
                  "value": "val"
              },
              {
                  "isFinal": false,
                  "key": "from.file.key",
                  "resource": "hadoop-dist/target/hadoop-3.0.0-SNAPSHOT/etc/hadoop/core-site.xml",
                  "value": "from.file.val"
              },
              {
                  "isFinal": false,
                  "key": "mapreduce.shuffle.port",
                  "resource": "mapred-site.xml",
                  "value": "${my.mapreduce.shuffle.port}"
              }
          ]
      }
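The JSON shape above (a `properties` array whose entries carry `key`, `value`, `resource` and `isFinal`) lends itself to simple post-processing. As a minimal sketch, not part of the patch, the following Python groups keys by the resource that defined them and flags `${...}` placeholders that no key in the dump defines. The function name `summarize` and the abridged `SAMPLE` are assumptions made for illustration; a placeholder flagged here may still be resolved at run time from a Java system property.

```python
import json
import re
from collections import defaultdict

# Matches ${name} placeholders such as ${my.mapreduce.shuffle.port}.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def summarize(config_json):
    """Group keys by resource and list placeholders with no matching key."""
    props = json.loads(config_json)["properties"]
    known = {p["key"] for p in props}
    by_resource = defaultdict(list)
    unresolved = {}
    for p in props:
        by_resource[p["resource"]].append(p["key"])
        missing = [n for n in PLACEHOLDER.findall(p["value"]) if n not in known]
        if missing:
            unresolved[p["key"]] = missing
    return dict(by_resource), unresolved


# Abridged copy of the output shown above (hypothetical helper input).
SAMPLE = """
{"properties": [
  {"isFinal": false, "key": "mapreduce.framework.name",
   "resource": "mapred-site.xml", "value": "yarn"},
  {"isFinal": false, "key": "my.test.conf",
   "resource": "from command line", "value": "val"},
  {"isFinal": false, "key": "mapreduce.shuffle.port",
   "resource": "mapred-site.xml", "value": "${my.mapreduce.shuffle.port}"}
]}
"""

if __name__ == "__main__":
    by_resource, unresolved = summarize(SAMPLE)
    for resource, keys in sorted(by_resource.items()):
        print(resource, "->", ", ".join(keys))
    for key, names in unresolved.items():
        print("unresolved in", key, ":", ", ".join(names))
```

In practice the input would be the tool's standard output rather than an embedded string.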
      
      1. HADOOP-10623.v01.patch
        4 kB
        Gera Shegalov
      2. HADOOP-10623.v02.patch
        6 kB
        Gera Shegalov
      3. HADOOP-10623.v03.patch
        7 kB
        Gera Shegalov
      4. HADOOP-10623.v04.patch
        7 kB
        Gera Shegalov

          Activity

          Gera Shegalov added a comment -

          v01 patch for review

          Gera Shegalov added a comment -

          Added ability to load the config from

          • an arbitrary filesystem (helps digesting job.xml from a staging submit dir)
          • include only a certain key in the

          Gera Shegalov added a comment -

          v03: adding the option -loadSites to add *-site.xml in one shot.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12646849/HADOOP-10623.v03.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/3973//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3973//console

          This message is automatically generated.

          Tsuyoshi Ozawa added a comment -

          I just glanced over the patch. This feature looks interesting and useful for Hadoop operators who want to confirm the current configuration. +1 to this feature. Should we add tests and documentation for this tool?

          Haohui Mai added a comment -

          This feature shows the configuration from the classpath instead of the configuration used at run time. Given that you can get the run-time configuration from the URL http://service/conf, I wonder whether it would be better to just load the run-time configuration instead?

          Gera Shegalov added a comment -

          Haohui Mai, the purpose of this tool is complementary to displaying the conf of a running service.

          • It can load conf from the classpath, a local path, or a FileSystem (e.g., job conf from the staging area).
          • It allows you to review *-default.xml separately, or overlaid with *-site.xml and the application/job conf.
          • Obviously there is the deficiency of not being able to see the hardcoded constants, somewhat compensated for by -defval.
          • We intend to use it to sanity-check config before rolling it out to a cluster, and to debug failed jobs more easily from the command line, especially when the AM failed before completing history.
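
The default/site overlay described above can be sketched as follows. This is a simplified Python model of the layering, not Hadoop's `Configuration` code: resources are applied in order, a later value replaces an earlier one, and a key that an earlier resource marked `<final>` is not overridden. The resource labels and property values are made up for illustration.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Yield (name, value, is_final) from a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        is_final = (prop.findtext("final") or "").strip().lower() == "true"
        if name is not None:
            yield name.strip(), value, is_final


def overlay(resources):
    """Apply (label, xml_text) pairs in order; later values win unless the key is final."""
    merged = {}
    for label, xml_text in resources:
        for name, value, is_final in read_properties(xml_text):
            if name in merged and merged[name]["isFinal"]:
                continue  # an earlier resource marked this key final
            merged[name] = {"value": value, "resource": label, "isFinal": is_final}
    return merged


# Hypothetical inputs: labels and values are illustrative only.
DEFAULTS = """<configuration>
  <property><name>mapreduce.framework.name</name><value>local</value></property>
  <property><name>hadoop.tmp.dir</name><value>/tmp/hadoop</value><final>true</final></property>
</configuration>"""

SITE = """<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>hadoop.tmp.dir</name><value>/data/tmp</value></property>
</configuration>"""

if __name__ == "__main__":
    merged = overlay([("example-default.xml", DEFAULTS), ("example-site.xml", SITE)])
    for key in sorted(merged):
        print(key, merged[key])
```

Recording the winning resource next to each value mirrors the `resource` field in the tool's output, which is what makes a rollout check from the command line practical.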
          Steve Loughran added a comment -

          This will be useful as a way of diagnosing problems.

          1. I think we really ought to have a yarn diagnostics operation, with this being a sub-operation. That way, this could be just one of the operations supported: some diagnostics --configuration ...
          2. This implies use of the org.apache.commons.cli.Options code to parse the options; this scales better than doing it all by hand.
          3. Isolating the parsing from the printing makes this testable.
          4. Configuration.addResource() gets into trouble if the resource isn't there. It should be checked and reported as missing first (maybe a failure for the listed resources, or just a --failifmissing option to bail out if, say, yarn-site.xml isn't on the CP).
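
Points 2 to 4 can be sketched as follows, in Python with `argparse` standing in for `org.apache.commons.cli.Options`. This is an illustration of the structure only, under stated assumptions: `-xml` and `-json` come from the usage text in the description, `--failifmissing` is the option floated above, and `configtool-sketch`, `parse_args`, `find_missing` and `run` are hypothetical names. Only local-path resources (those containing '/') are checked; a classpath lookup has no direct equivalent here and is left out.

```python
import argparse
import os


def parse_args(argv):
    """Parse options only; no I/O happens here, so this is easy to unit test."""
    parser = argparse.ArgumentParser(prog="configtool-sketch")
    fmt = parser.add_mutually_exclusive_group()
    fmt.add_argument("-xml", dest="fmt", action="store_const", const="xml")
    fmt.add_argument("-json", dest="fmt", action="store_const", const="json")
    parser.add_argument("--failifmissing", action="store_true")
    parser.add_argument("resources", nargs="*")
    parser.set_defaults(fmt="json")
    return parser.parse_args(argv)


def find_missing(resources, exists=os.path.exists):
    """Return local-filesystem resources (those containing '/') that do not exist."""
    return [r for r in resources if "/" in r and not exists(r)]


def run(argv, exists=os.path.exists):
    """Return (exit_code, lines) instead of printing, so callers decide how to report."""
    args = parse_args(argv)
    missing = find_missing(args.resources, exists)
    lines = ["missing resource: " + r for r in missing]
    if missing and args.failifmissing:
        return 1, lines
    loadable = [r for r in args.resources if r not in missing]
    lines.append("would load as %s: %s" % (args.fmt, ", ".join(loadable)))
    return 0, lines


if __name__ == "__main__":
    import sys

    code, output = run(sys.argv[1:])
    print("\n".join(output))
    sys.exit(code)
```

Because `run` returns its result and takes the existence check as a parameter, a test can exercise the missing-resource path without touching the filesystem.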
          Haohui Mai added a comment -

          Thanks for the explanation. This should be a useful feature.

          Note that there is a GetConf tool in HDFS. Do you think it might be cleaner to integrate this patch with it?

          Gera Shegalov added a comment -

          Haohui Mai, it makes sense to have a common tool. It should live in hadoop-tools or hadoop-common IMO.

          Gera Shegalov added a comment -

          v04 does not address the helpful review above yet. It merely demonstrates that YARN-1741 can be solved using the existing FsUrlStreamHandler.

          This is the content of core-site.xml and mounttable.xml as stored on the cluster:

          [... hadoop-common (HADOOP-10623)]$ hadoop fs -cat viewfs:/user/gshegalov/conf/core-site.xml 
          2014-07-04 17:59:48.847 java[72316:1b03] Unable to load realm info from SCDynamicStore
          14/07/04 17:59:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          <?xml version="1.0" encoding="UTF-8"?>
          <configuration  xmlns:xi="http://www.w3.org/2001/XInclude">
            <xi:include href="mounttable.xml"/>
          
            <property>
              <name>hadoop.tmp.dir</name>
              <value>${my.hadoop.tmp.dir}</value>
            </property>
            <property>
              <name>fs.defaultFS</name>
              <value>viewfs:///</value>
            </property>
          </configuration>
          
          [... hadoop-common (HADOOP-10623)]$ hadoop fs -cat viewfs:/user/gshegalov/conf/mounttable.xml
          2014-07-04 17:59:56.705 java[72339:1903] Unable to load realm info from SCDynamicStore
          14/07/04 17:59:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          <configuration>
            <property>
              <name>fs.viewfs.mounttable.default.link./user</name>
              <value>hdfs://ns1/user</value> 
            </property>
            <property>
              <name>fs.viewfs.mounttable.default.link./tmp</name>
              <value>hdfs://ns2/tmp</value> 
            </property>
          </configuration>
          

          That's how the Configuration is able to load mounttable via core-site when core-site is added as a URL.

          [... hadoop-common (HADOOP-10623)]$ yarn org.apache.hadoop.util.ConfigTool viewfs:/user/gshegalov/conf/core-site.xml | python -mjson.tool
          2014-07-04 18:01:00.873 java[72374:1903] Unable to load realm info from SCDynamicStore
          14/07/04 18:01:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          {
              "properties": [
                  {
                      "isFinal": false,
                      "key": "mapreduce.client.genericoptionsparser.used",
                      "resource": "programatically",
                      "value": "true"
                  },
                  {
                      "isFinal": false,
                      "key": "fs.defaultFS",
                      "resource": "viewfs:/user/gshegalov/conf/core-site.xml",
                      "value": "viewfs:///"
                  },
                  {
                      "isFinal": false,
                      "key": "hadoop.tmp.dir",
                      "resource": "viewfs:/user/gshegalov/conf/core-site.xml",
                      "value": "${my.hadoop.tmp.dir}"
                  },
                  {
                      "isFinal": false,
                      "key": "fs.viewfs.mounttable.default.link./tmp",
                      "resource": "viewfs:/user/gshegalov/conf/core-site.xml",
                      "value": "hdfs://ns2/tmp"
                  },
                  {
                      "isFinal": false,
                      "key": "fs.viewfs.mounttable.default.link./user",
                      "resource": "viewfs:/user/gshegalov/conf/core-site.xml",
                      "value": "hdfs://ns1/user"
                  }
              ]
          }
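
The effect shown above, where a relative `xi:include` href is resolved against the location of the including document, can be illustrated with a small stand-alone sketch. This is Python using `xml.etree.ElementInclude`, not Hadoop's Java code path; the `FILES` dictionary is a hypothetical stand-in for the cluster filesystem, its contents are abridged from the listings above, and only a single level of inclusion is handled.

```python
import posixpath
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Stand-in for the cluster filesystem: URL -> document text (abridged from above).
FILES = {
    "viewfs:/user/gshegalov/conf/core-site.xml": """<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="mounttable.xml"/>
  <property><name>hadoop.tmp.dir</name><value>${my.hadoop.tmp.dir}</value></property>
  <property><name>fs.defaultFS</name><value>viewfs:///</value></property>
</configuration>""",
    "viewfs:/user/gshegalov/conf/mounttable.xml": """<configuration>
  <property><name>fs.viewfs.mounttable.default.link./user</name><value>hdfs://ns1/user</value></property>
  <property><name>fs.viewfs.mounttable.default.link./tmp</name><value>hdfs://ns2/tmp</value></property>
</configuration>""",
}


def load_with_includes(url, files=FILES):
    """Parse url and expand xi:include hrefs relative to the including document."""
    base_dir = posixpath.dirname(url)

    def loader(href, parse, encoding=None):
        # Resolve the relative href against the including document's directory.
        target = posixpath.join(base_dir, href)
        if parse == "xml":
            return ET.fromstring(files[target])
        return files[target]

    root = ET.fromstring(files[url])
    ElementInclude.include(root, loader=loader)
    # The included <configuration> is nested in place; iter() walks into it.
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


if __name__ == "__main__":
    props = load_with_includes("viewfs:/user/gshegalov/conf/core-site.xml")
    for name in sorted(props):
        print(name, "=", props[name])
```

Had core-site.xml been supplied without its location, the loader would have no directory to resolve `mounttable.xml` against, which is why adding the resource as a URL matters in the demonstration above.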
          
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12654156/HADOOP-10623.v04.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4216//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4216//console

          This message is automatically generated.

          Allen Wittenauer added a comment -

          There is also HADOOP-9044 (which I just committed), which is along the same lines.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12654156/HADOOP-10623.v04.patch
          against trunk revision 276485e.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

          org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/5607//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/5607//console

          This message is automatically generated.

          Allen Wittenauer added a comment -

          Cancelling patch as it no longer applies.

          Vinod Kumar Vavilapalli added a comment -

          Moving features/enhancements out of previously closed releases into the next minor release 2.8.0.


            People

            • Assignee: Gera Shegalov
            • Reporter: Gera Shegalov
            • Votes: 0
            • Watchers: 11
