Hadoop Common
  1. Hadoop Common
  2. HADOOP-6756

Clean up and add documentation for configuration keys in CommonConfigurationKeys.java

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.3, 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: fs
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Configuration keys in CommonConfigurationKeys.java should be cleaned up and documented (javadoc comments, appropriate *-default.xml descriptions).

      1. jira.HADOOP-6756-0.20-1-FS_DEFAULT_NAME_KEY.patch
        0.7 kB
        Erik Steffl
      2. jira.HADOOP-6756-0.20-1.patch
        9 kB
        Erik Steffl
      3. jira.HADOOP-6756-0.20.patch
        9 kB
        Erik Steffl
      4. HADOOP-6756-0.22.0-3.patch
        26 kB
        Erik Steffl
      5. HADOOP-6756-0.22.0-2.patch
        26 kB
        Erik Steffl
      6. HADOOP-6756-0.22.0-1.patch
        26 kB
        Erik Steffl
      7. HADOOP-6756-0.22.0.patch
        25 kB
        Erik Steffl

        Activity

        Hide
        Erik Steffl added a comment -

        Proposed fix for 0.20:

        Most of the keys in CommonConfigurationKeys.java are not used in 0.20, the variables
        should be pruned down to:

        • FS_DEFAULT_NAME_KEY "fs.defaultFS", not in any xml file, should be fs.default.name which is in
          core-default.xml
        • HADOOP_SECURITY_GROUP_MAPPING "hadoop.security.group.mapping" not in any xml file, add to
          core-default.xml, default value is ShellBasedUnixGroupsMapping
        • HADOOP_SECURITY_AUTHENTICATION "hadoop.security.authentication" in src/core/core-default.xml
        • HADOOP_SECURITY_AUTHORIZATION "hadoop.security.authorization" in src/core/core-default.xml
        • HADOOP_SECURITY_SERVICE_USER_NAME_KEY "hadoop.security.service.user.name.key" not in any configuration
          file, add a comment to core-default.xml, it has no default value
        • HADOOP_CLUSTER_ADMINISTRATORS_PROPERTY "hadoop.cluster.administrators" in src/core/core-default.xml

        The javadoc comments will refer to core-default.xml file.

        Show
        Erik Steffl added a comment - Proposed fix for 0.20: Most of the keys in CommonConfigurationKeys.java are not used in 0.20, the variables should be pruned down to: FS_DEFAULT_NAME_KEY "fs.defaultFS", not in any xml file, should be fs.default.name which is in core-default.xml HADOOP_SECURITY_GROUP_MAPPING "hadoop.security.group.mapping" not in any xml file, add to core-default.xml, default value is ShellBasedUnixGroupsMapping HADOOP_SECURITY_AUTHENTICATION "hadoop.security.authentication" in src/core/core-default.xml HADOOP_SECURITY_AUTHORIZATION "hadoop.security.authorization" in src/core/core-default.xml HADOOP_SECURITY_SERVICE_USER_NAME_KEY "hadoop.security.service.user.name.key" not in any configuration file, add a comment to core-default.xml, it has no default value HADOOP_CLUSTER_ADMINISTRATORS_PROPERTY "hadoop.cluster.administrators" in src/core/core-default.xml The javadoc comments will refer to core-default.xml file.
        Hide
        Hairong Kuang added a comment -

        Minor comments:
        1. for javadoc comments, instead of "see core-default.xml", use a sentence to describe it.
        2. last indention change is not needed.

        Show
        Hairong Kuang added a comment - Minor comments: 1. for javadoc comments, instead of "see core-default.xml", use a sentence to describe it. 2. last indention change is not needed.
        Hide
        Erik Steffl added a comment -

        Re Hairong's feedback: we discussed this and decided to just use reference to core-defualt.xml (full filename) so that the documentation is not duplicated. As far as intendation goes it was consistent with other lines, however that config key was removed in the meantime so it doesn't matter anymore.

        Other changes:

        Show
        Erik Steffl added a comment - Re Hairong's feedback: we discussed this and decided to just use reference to core-defualt.xml (full filename) so that the documentation is not duplicated. As far as intendation goes it was consistent with other lines, however that config key was removed in the meantime so it doesn't matter anymore. Other changes: in the meantime HADOOP_CLUSTER_ADMINISTRATORS_PROPERTY was removed, see patch https://issues.apache.org/jira/secure/attachment/12443928/patch-1754-ydist.txt .
        Hide
        Hairong Kuang added a comment -

        should FS_DEFAULT_NAME_KEY be fs.default.name for the .20 patch?

        Show
        Hairong Kuang added a comment - should FS_DEFAULT_NAME_KEY be fs.default.name for the .20 patch?
        Hide
        Erik Steffl added a comment -

        For 0.22: issue of io.sort.mb and io.sort.factor should be resolved but is outside of scope of this jira, filed https://issues.apache.org/jira/browse/HADOOP-6801 io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java

        Show
        Erik Steffl added a comment - For 0.22: issue of io.sort.mb and io.sort.factor should be resolved but is outside of scope of this jira, filed https://issues.apache.org/jira/browse/HADOOP-6801 io.sort.mb and io.sort.factor were renamed and moved to mapreduce but are still in CommonConfigurationKeysPublic.java and used in SequenceFile.java
        Hide
        Erik Steffl added a comment -

        For 0.22: seems like FS_CLIENT_BUFFER_DIR_KEY = "fs.client.buffer.dir" should be removed, filed https://issues.apache.org/jira/browse/HADOOP-6802 Remove FS_CLIENT_BUFFER_DIR_KEY = "fs.client.buffer.dir" from CommonConfigurationKeys.java (not used, deprecated)

        Show
        Erik Steffl added a comment - For 0.22: seems like FS_CLIENT_BUFFER_DIR_KEY = "fs.client.buffer.dir" should be removed, filed https://issues.apache.org/jira/browse/HADOOP-6802 Remove FS_CLIENT_BUFFER_DIR_KEY = "fs.client.buffer.dir" from CommonConfigurationKeys.java (not used, deprecated)
        Hide
        Konstantin Shvachko added a comment -
        1. In core-default.xml some new fields have been introduce without default values.
          The default values should be there in order to define the default behavior and also as an example.
        2. Some fields new fields in core-default.xml are commented out. What does that mean?
          If they are hidden (undocumented) variables, then they should not be core-default.xml,
          if they are public, then they should not be commented out and have a reasonable default value.
        3. hadoop.security.group.mapping is listed twice in core-default.xml.
          Remove the one with empty value, and merge the descriptions.
        4. I think it makes sense to introduce the security section in core-default.xml and group all security fields under it:
           <!--- security properties -->
          <property>
            <name>hadoop.security.group.mapping</name>
          ...
          
        5. I did not understand the description of io.map.index.interval.
        6. In CommonConfigurationKeysPublic I am not sure where "See src/core/core-default.xml" should lead to.
          I'd propose to use something like this:
          /** See  <a href="{@docRoot}/common-default.xml">common-default.xml</a> */
          
        7. The comment in CommonConfigurationKeys
          //TBD: Code is not updated to use following keys.
          //These keys will be used in later versions
          

          should be investigated in terms of which variables it is referred to.
          In the least it is misplaced, because FS_AUTOMATIC_CLOSE_KEY and FS_FTP_HOST_KEY are definitely used.
          If it is a legacy artifact, then let's remove it.

        Show
        Konstantin Shvachko added a comment - In core-default.xml some new fields have been introduce without default values. The default values should be there in order to define the default behavior and also as an example. Some fields new fields in core-default.xml are commented out. What does that mean? If they are hidden (undocumented) variables, then they should not be core-default.xml, if they are public, then they should not be commented out and have a reasonable default value. hadoop.security.group.mapping is listed twice in core-default.xml. Remove the one with empty value, and merge the descriptions. I think it makes sense to introduce the security section in core-default.xml and group all security fields under it: <!--- security properties --> <property> <name>hadoop.security.group.mapping</name> ... I did not understand the description of io.map.index.interval . In CommonConfigurationKeysPublic I am not sure where "See src/core/core-default.xml" should lead to. I'd propose to use something like this: /** See <a href= "{@docRoot}/common- default .xml" >common- default .xml</a> */ The comment in CommonConfigurationKeys //TBD: Code is not updated to use following keys. //These keys will be used in later versions should be investigated in terms of which variables it is referred to. In the least it is misplaced, because FS_AUTOMATIC_CLOSE_KEY and FS_FTP_HOST_KEY are definitely used. If it is a legacy artifact, then let's remove it.
        Hide
        Erik Steffl added a comment -

        Re https://issues.apache.org/jira/browse/HADOOP-6756?focusedCommentId=12875770&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12875770

        1, 2: some fields do not have reasonable defaults therefore are commented out and are without default values (I was told that empty values are not handled well by Configuration), if you can think of reasonable defaults for these please let me know

        3: will fix

        4: reasonable, will do

        5: Let's discuss off-line and come up with something that makes sense, it's my best guess from looking at code

        6: This is a bit tricky, there's core-default.html which is built by ant docs (not ant javadocs) and resulting core-default.html is outside of javadoc docRoot. It can be referred to using "

        {@docRoot}

        /../core-default.html" (note the parent directory of docRoot). This makes the end result ok for now but it's dangerous because it makes javadoc docs rely on docs built outside of javadoc.

        7: This comment was there before, thought it's out of scope for this bug, i.e. I am trying to document existing variables, not decide which ones are used or not etc.

        Show
        Erik Steffl added a comment - Re https://issues.apache.org/jira/browse/HADOOP-6756?focusedCommentId=12875770&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12875770 1, 2: some fields do not have reasonable defaults therefore are commented out and are without default values (I was told that empty values are not handled well by Configuration), if you can think of reasonable defaults for these please let me know 3: will fix 4: reasonable, will do 5: Let's discuss off-line and come up with something that makes sense, it's my best guess from looking at code 6: This is a bit tricky, there's core-default.html which is built by ant docs (not ant javadocs) and resulting core-default.html is outside of javadoc docRoot. It can be referred to using " {@docRoot} /../core-default.html" (note the parent directory of docRoot). This makes the end result ok for now but it's dangerous because it makes javadoc docs rely on docs built outside of javadoc. 7: This comment was there before, thought it's out of scope for this bug, i.e. I am trying to document existing variables, not decide which ones are used or not etc.
        Hide
        Erik Steffl added a comment -

        Patch https://issues.apache.org/jira/secure/attachment/12446625/HADOOP-6756-0.22.0-1.patch resolves issues of previous patch as described in comment https://issues.apache.org/jira/browse/HADOOP-6756?focusedCommentId=12876461&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12876461

        Description of changes:

        1,2: we decided upon default values for previously commented out fields, all new config parameters are uncommented now

        3: fixed

        4: done

        5: new description, hopefully it makes sense now

        6: "

        {@docRoot}

        /../core-default.html" is used to refer to core-default.xml, it's a document created by ant docs (forrest generated docs)

        7: made the comment clear

        Show
        Erik Steffl added a comment - Patch https://issues.apache.org/jira/secure/attachment/12446625/HADOOP-6756-0.22.0-1.patch resolves issues of previous patch as described in comment https://issues.apache.org/jira/browse/HADOOP-6756?focusedCommentId=12876461&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12876461 Description of changes: 1,2: we decided upon default values for previously commented out fields, all new config parameters are uncommented now 3: fixed 4: done 5: new description, hopefully it makes sense now 6: " {@docRoot} /../core-default.html" is used to refer to core-default.xml, it's a document created by ant docs (forrest generated docs) 7: made the comment clear
        Hide
        Konstantin Shvachko added a comment -

        +1 on the patch.

        Show
        Konstantin Shvachko added a comment - +1 on the patch.
        Hide
        Erik Steffl added a comment -

        Patch HADOOP-6756-0.22.0-2.patch has a new improved documentation for io.map.index.interval.

        Show
        Erik Steffl added a comment - Patch HADOOP-6756 -0.22.0-2.patch has a new improved documentation for io.map.index.interval.
        Hide
        Erik Steffl added a comment -

        HADOOP-6756-0.22.0-3.patch applies to current trunk on 2010/6/2

        Show
        Erik Steffl added a comment - HADOOP-6756 -0.22.0-3.patch applies to current trunk on 2010/6/2
        Hide
        Konstantin Shvachko added a comment -

        I just committed this. Thank you Erik.

        Show
        Konstantin Shvachko added a comment - I just committed this. Thank you Erik.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #318 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/318/)
        HADOOP-6756. Documentation for common configuration keys. Contributed by Erik Steffl.

        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk-Commit #318 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk-Commit/318/ ) HADOOP-6756 . Documentation for common configuration keys. Contributed by Erik Steffl.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk #385 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/385/)
        HADOOP-6756. Documentation for common configuration keys. Contributed by Erik Steffl.

        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk #385 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/385/ ) HADOOP-6756 . Documentation for common configuration keys. Contributed by Erik Steffl.

          People

          • Assignee:
            Erik Steffl
            Reporter:
            Erik Steffl
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development