Hadoop HDFS > HDFS-5535: Umbrella JIRA for improved HDFS rolling upgrades > HDFS-5446

Consider supporting a mechanism to allow datanodes to drain outstanding work during rolling upgrade

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: datanode
    • Labels: None

      Description

      Rebuilding write pipelines is expensive, and it can happen many times during a rolling restart of datanodes (i.e., during a rolling upgrade). It might help if a datanode could be told to drain its current work while rejecting new requests, possibly with a new response indicating that the node is temporarily unavailable: it isn't broken, it's just going through a maintenance phase in which it shouldn't accept new work.

      Waiting just a few seconds is normally enough to clear up a good percentage of the open requests without error, thus reducing the overhead associated with restarting lots of datanodes in rapid succession.

      Obviously this would need a timeout to make sure the datanode doesn't wait forever.
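      To make the idea concrete, here is a minimal sketch of what the datanode side could look like. The names in it (XceiverTracker, tryAccept, drainAndStop) are hypothetical, not existing HDFS code; it only shows the shape of the behavior described above: reject new requests once draining begins, let in-flight work finish, and bound the wait with a timeout.

```java
// Hypothetical sketch, not actual DataNode code: XceiverTracker,
// tryAccept, and drainAndStop are invented names for illustration.
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class XceiverTracker {
  private final AtomicBoolean draining = new AtomicBoolean(false);
  private final AtomicInteger inFlight = new AtomicInteger(0);

  /** Called when a new read/write request arrives. */
  public boolean tryAccept() {
    inFlight.incrementAndGet();
    if (draining.get()) {
      inFlight.decrementAndGet();
      return false; // caller replies "temporarily unavailable" rather than failing
    }
    return true;
  }

  /** Called when a request completes. */
  public void complete() {
    inFlight.decrementAndGet();
  }

  /** Reject new work, then wait up to timeoutMs for current work to drain. */
  public boolean drainAndStop(long timeoutMs) throws InterruptedException {
    draining.set(true);
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (inFlight.get() > 0 && System.nanoTime() < deadline) {
      Thread.sleep(50); // a few seconds is usually enough to clear most requests
    }
    return inFlight.get() == 0; // false means the timeout fired with work still open
  }
}
```

      The return value lets the caller tell a clean drain from a timed-out one, though the restart would proceed either way.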

        Activity

        Transition: Open → Resolved
        Time in source status: 99d 5h 16m | Execution times: 1 | Last executer: Kihwal Lee | Last execution date: 07/Feb/14 23:09
        Kihwal Lee added a comment -

        After the OOB acking feature, I believe we can make the DN tell writers to move off more easily. Although this is less useful for rolling upgrades, it can solve the problem of decommissioning nodes that have long-running, slow writers. Clients will be able to migrate their writes to another node, so even blocks with a single replica will continue to work.

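        To illustrate the OOB-ack idea in the comment above: a hedged client-side sketch, in which the enum value and method names (AckStatus.OOB_RESTART, onAck, rebuildPipelineWithout) are assumptions rather than the real DataTransferProtocol types. The point is only that the restart notice is handled as a signal to migrate the write, not as a pipeline failure.

```java
// Illustrative only; these names are hypothetical, not the real
// DataTransferProtocol API. The datanode sends an out-of-band ack
// flagged as a restart notice, and the client rebuilds its write
// pipeline around that node instead of treating the ack as an error.
enum AckStatus { SUCCESS, ERROR, OOB_RESTART }

class PipelineAck {
  final AckStatus status;
  PipelineAck(AckStatus status) { this.status = status; }
}

class WritePipeline {
  void onAck(PipelineAck ack, String sourceNode) {
    switch (ack.status) {
      case OOB_RESTART:
        // The node is draining for a restart, not failed: exclude it
        // and re-establish the pipeline with a replacement datanode.
        rebuildPipelineWithout(sourceNode);
        break;
      case ERROR:
        handleFailure(sourceNode); // existing error-recovery path
        break;
      default:
        break; // normal ack: nothing special to do
    }
  }

  void rebuildPipelineWithout(String node) { /* pipeline recovery */ }
  void handleFailure(String node) { /* standard failure handling */ }
}
```

        The same signal would cover the decommissioning case mentioned above: a long-running slow writer receives the notice and migrates, so even a single-replica block keeps making progress.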
        Kihwal Lee made changes -
        Status: Open → Resolved
        Resolution: Won't Fix
        Nathan Roberts made changes -
        Parent: HDFS-5535
        Issue Type: Improvement → Sub-task
        Andrew Wang added a comment -

        Very interesting idea, thanks for filing this, Nathan. Are you thinking this would be an NN-side thing, like DN decommissioning? The two are kind of similar: decommissioning DNs aren't assigned more blocks to write, and I believe they are deprioritized for reads as well. For a rolling restart, we just wouldn't be moving the blocks off. It'd be up to the admin to toggle/untoggle this state as the rolling restart progresses.

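        A rough sketch of that NN-side toggle, assuming invented names (DrainAwarePlacement, setDraining) and a much-simplified placement interface; the real BlockPlacementPolicy is considerably more involved. As with decommissioning, draining nodes get no new writes, but unlike decommissioning their existing replicas stay put, so no re-replication is triggered.

```java
// Hypothetical sketch of a namenode-side "draining" flag, analogous to
// decommissioning; the names and the placement interface are invented.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DrainAwarePlacement {
  private final Set<String> draining = new HashSet<>();

  /** Toggled by the admin as the rolling restart reaches each node. */
  void setDraining(String datanode, boolean on) {
    if (on) {
      draining.add(datanode);
    } else {
      draining.remove(datanode);
    }
  }

  /** Choose targets for a new block, skipping nodes that are draining. */
  List<String> chooseTargets(List<String> liveNodes, int replication) {
    List<String> targets = new ArrayList<>();
    for (String dn : liveNodes) {
      if (targets.size() == replication) {
        break;
      }
      if (!draining.contains(dn)) {
        targets.add(dn);
      }
    }
    return targets;
  }
}
```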
        Nathan Roberts created issue -

          People

          • Assignee: Unassigned
          • Reporter: Nathan Roberts
          • Votes: 1
          • Watchers: 15

            Dates

            • Created:
            • Updated:
            • Resolved:
