  1. Hadoop HDFS
  2. HDFS-12643

HDFS maintenance state behaviour is confusing and not well documented


Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • documentation, namenode
    • None

    Description

      The current implementation of the HDFS maintenance state feature is confusing and error-prone. The documentation is missing important information that's required for the correct use of the feature.

      For example, if the Hadoop admin wants to put a single node in maintenance state, they can add a single entry to the maintenance file with the contents:

      {
         "hostName": "host-1.example.com",
         "adminState": "IN_MAINTENANCE",
         "maintenanceExpireTimeInMS": 1507663698000
      }
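
      The maintenanceExpireTimeInMS value is an epoch timestamp in milliseconds. As a minimal sketch, an entry like the one above could be generated from a start time and a duration. The helper name maintenance_entry and the one-hour window are hypothetical and only for illustration:

```python
import json
from datetime import datetime, timedelta, timezone


def maintenance_entry(host_name, start, duration):
    """Build one maintenance-file entry; the expiry is epoch milliseconds."""
    expire_ms = int((start + duration).timestamp() * 1000)
    return {
        "hostName": host_name,
        "adminState": "IN_MAINTENANCE",
        "maintenanceExpireTimeInMS": expire_ms,
    }


# Hypothetical window: starts 2017-10-10 18:28:18 UTC and lasts one hour.
start = datetime(2017, 10, 10, 18, 28, 18, tzinfo=timezone.utc)
entry = maintenance_entry("host-1.example.com", start, timedelta(hours=1))
print(json.dumps(entry, indent=2))
```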
      

      Let's say now that the actual maintenance finished well before the set expiration time and the Hadoop admin wants to bring the node back to NORMAL state. It would be natural to simply change the state of the node, as shown below, and run another refresh:

      {
         "hostName": "host-1.example.com",
         "adminState": "NORMAL"
      }
      

      The configuration file above, though, not only takes the node host-1 out of maintenance state but also blacklists all the other DataNodes. This behaviour seems inconsistent to me. It happens because emptyInServiceNodeLists is set to false only when at least one node with adminState = NORMAL is listed in the file.
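
      The behaviour described above can be modelled with a small simulation. This is a simplified sketch based only on the description in this issue, not the actual NameNode code. The function name is hypothetical, and entries are assumed to carry an explicit adminState as in the examples:

```python
def unlisted_hosts_allowed_current(entries):
    """Model of the reported behaviour for hosts NOT listed in the file.

    Per the description, emptyInServiceNodeLists becomes false only when
    at least one listed entry has adminState NORMAL. While it stays true,
    unlisted hosts remain allowed; once it is false, they are blacklisted.
    """
    empty_in_service_node_lists = not any(
        e.get("adminState") == "NORMAL" for e in entries
    )
    return empty_in_service_node_lists


during_maintenance = [
    {
        "hostName": "host-1.example.com",
        "adminState": "IN_MAINTENANCE",
        "maintenanceExpireTimeInMS": 1507663698000,
    }
]
after_maintenance = [
    {"hostName": "host-1.example.com", "adminState": "NORMAL"}
]

# Same single-host file, different state, opposite effect on other nodes.
print(unlisted_hosts_allowed_current(during_maintenance))
print(unlisted_hosts_allowed_current(after_maintenance))
```

      In this model the only difference between the two files is the adminState of host-1, yet the second file changes the treatment of every other DataNode.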

      I believe that it would be more consistent, and less error-prone, to simply implement the following:

      • If the dfs.hosts file is empty, all nodes are allowed and in normal state
      • If the file is not empty, any host not listed in the file is blacklisted, regardless of the state of the hosts listed in the file.
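
      The two rules above can be sketched as a single membership check. This is only an illustration of the proposal, with a hypothetical function name, and not a patch against the NameNode:

```python
def is_host_allowed_proposed(host, entries):
    """Proposed rule: an empty file allows every host; otherwise only
    listed hosts are allowed, whatever adminState the entries carry."""
    if not entries:
        return True
    return any(e["hostName"] == host for e in entries)


maintenance_file = [
    {
        "hostName": "host-1.example.com",
        "adminState": "IN_MAINTENANCE",
        "maintenanceExpireTimeInMS": 1507663698000,
    }
]
normal_file = [{"hostName": "host-1.example.com", "adminState": "NORMAL"}]

# host-2 is unlisted, so both files treat it the same way under this rule.
print(is_host_allowed_proposed("host-2.example.com", maintenance_file))
print(is_host_allowed_proposed("host-2.example.com", normal_file))
```

      Under this rule, switching host-1 between IN_MAINTENANCE and NORMAL no longer changes how unlisted hosts are treated.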

      Whether or not the implementation is changed, the documentation also needs to be updated so that readers know about the caveats mentioned above.

          People

            Unassigned Unassigned
            asdaraujo Andre Araujo