Hadoop HDFS / HDFS-1726

query method for what kind of safe mode the Namenode is in

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.22.0
    • Fix Version/s: None
    • Component/s: namenode
    • Labels:
      None
    • Tags:
      safe mode, safemode, startup

      Description

      If we could differentiate between "startup safemode" vs other safemode, it would be easier to do startup optimizations like HDFS-1295. Looking at FSNamesystem, this can be queried, but not with a single query, and the semantics are not reliable under future changes. Also, the FSNamesystem code itself, internally, uses more than one way to test for manual safe mode.

      Proposal is to create a status field and query method in FSNamesystem with enum values

      {NOT_IN_SAFEMODE, SAFEMODE_STARTUP, SAFEMODE_EXTENSION, SAFEMODE_MANUAL}

      If in the future we add automatic fallback to safe mode, we would add value SAFEMODE_AUTOMATIC.

      This change will make it easier to do startup optimizations, and will also allow making the safemode management code in FSNamesystem simpler and more consistent.
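      The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in FSNamesystem or in the attached patch: the class `SafeModeStateSketch`, the stand-in `Namesystem` class, and the method names `getSafeModeState()`, `setSafeModeState()` and `isInSafeMode()` are hypothetical. Only the four enum values come from the description.

```java
/** Hypothetical sketch of the proposed status field and query method. */
public class SafeModeStateSketch {

    /** The enum values listed in the proposal. */
    public enum SafeModeState {
        NOT_IN_SAFEMODE,
        SAFEMODE_STARTUP,
        SAFEMODE_EXTENSION,
        SAFEMODE_MANUAL
    }

    /** Stand-in for the namesystem; names and initial state are assumptions. */
    public static class Namesystem {
        // Assumed here: a freshly started namenode begins in startup safe mode.
        private SafeModeState state = SafeModeState.SAFEMODE_STARTUP;

        /** Single query that reports which kind of safe mode is active. */
        public synchronized SafeModeState getSafeModeState() {
            return state;
        }

        /** All transitions go through one place, so the field stays authoritative. */
        public synchronized void setSafeModeState(SafeModeState newState) {
            state = newState;
        }

        /** Convenience check derived from the single status field. */
        public synchronized boolean isInSafeMode() {
            return state != SafeModeState.NOT_IN_SAFEMODE;
        }
    }

    public static void main(String[] args) {
        Namesystem ns = new Namesystem();
        // A startup optimization could branch on the specific state.
        if (ns.getSafeModeState() == SafeModeState.SAFEMODE_STARTUP) {
            System.out.println("startup safe mode: optimization may apply");
        }
        ns.setSafeModeState(SafeModeState.SAFEMODE_MANUAL);
        System.out.println(ns.getSafeModeState() + " inSafeMode=" + ns.isInSafeMode());
    }
}
```

      With a single field like this, a caller that only cares about startup does not have to combine several internal flags, which is the inconsistency the description points at.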

        Issue Links

          Activity

          Transition: Open to Resolved
          Time In Source Status: 1315d 22h 44m
          Execution Times: 1
          Last Executer: Haohui Mai
          Last Execution Date: 11/Oct/14 00:34
          Vinod Kumar Vavilapalli made changes -
          Fix Version/s 2.0.0-alpha [ 12320353 ]
          Vinod Kumar Vavilapalli added a comment -

          Dropping fix-version from 'non-fixed' (didn't have code-fixes) JIRAs.

          Allen Wittenauer made changes -
          Fix Version/s 2.0.0-alpha [ 12320353 ]
          Fix Version/s 0.24.0 [ 12317653 ]
          Haohui Mai made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Not a Problem [ 8 ]
          Haohui Mai added a comment -

          It seems that it is no longer an issue in the 2.x releases.

          Gavin made changes -
          Link This issue is depended upon by HDFS-1594 [ HDFS-1594 ]
          Gavin made changes -
          Link This issue blocks HDFS-1594 [ HDFS-1594 ]
          Steve Loughran added a comment -

          It'd be nice to also have text for end users, which could be set when entering safe mode manually. This would let the ops teams add text like "down for rack maintenance, back at 18:00" for the web UI and exception text.

          Arun C Murthy made changes -
          Fix Version/s 0.24.0 [ 12317653 ]
          Fix Version/s 0.23.0 [ 12315571 ]
          Matt Foley made changes -
          Attachment SafeModeState_v2.patch [ 12474681 ]
          Matt Foley added a comment -

          Besides creating SafeModeState and a way to query it, this patch fixes two minor bugs found during development:

          • When entering manual mode via the SafeModeInfo() ctor, the code set "reached" = -1. This caused an incorrect getTurnOffTip() result, because "reached == -1" is supposed to mean safe mode is off. The -1 value is also not needed for manual mode detection, which relies on "extension = Integer.MAX_VALUE". Changed to set it to 0, consistent with being in safe mode (regardless of reason).
          • safeMode.incrementSafeBlockCount() and safeMode.decrementSafeBlockCount() called checkMode() on every call, regardless of whether they changed anything that needs checkMode(). Changed to call checkMode() only if they succeed in changing the blockSafe count.

          Finally, found some inconsistencies in the use of "synchronized". Opened ticket HDFS-1790 to clarify them.
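          The second fix above (calling checkMode() only when the count actually changes) can be illustrated with a small sketch. This is not the patch itself: the class `SafeBlockCounterSketch`, the replication conditions, and the `checkModeCalls` counter are assumptions made for illustration; only the method names incrementSafeBlockCount(), decrementSafeBlockCount(), checkMode() and the blockSafe count come from the comment.

```java
/** Hypothetical sketch of guarding checkMode() behind an actual count change. */
public class SafeBlockCounterSketch {
    private final int safeReplication; // assumed: replicas needed for a block to count as safe
    private int blockSafe;             // number of blocks currently considered safe
    private int checkModeCalls;        // instrumentation for this sketch only

    public SafeBlockCounterSketch(int safeReplication) {
        this.safeReplication = safeReplication;
    }

    /** Called when a block gains a replica; 'replication' is the new replica count. */
    public synchronized void incrementSafeBlockCount(int replication) {
        if (replication == safeReplication) {
            blockSafe++;
            checkMode(); // only re-evaluate when blockSafe changed
        }
    }

    /** Called when a block loses a replica; 'replication' is the new replica count. */
    public synchronized void decrementSafeBlockCount(int replication) {
        if (replication == safeReplication - 1) {
            blockSafe--;
            checkMode(); // only re-evaluate when blockSafe changed
        }
    }

    /** Placeholder for the threshold re-evaluation; here it only counts invocations. */
    private void checkMode() {
        checkModeCalls++;
    }

    public synchronized int getBlockSafe() { return blockSafe; }
    public synchronized int getCheckModeCalls() { return checkModeCalls; }

    public static void main(String[] args) {
        SafeBlockCounterSketch c = new SafeBlockCounterSketch(1);
        c.incrementSafeBlockCount(2); // block was already safe: no change, no check
        c.incrementSafeBlockCount(1); // block just became safe: count and check
        System.out.println(c.getBlockSafe() + " safe, " + c.getCheckModeCalls() + " checks");
    }
}
```

          The point of the guard is that calls which leave blockSafe untouched no longer pay for a threshold re-evaluation.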

          Konstantin Boudnik made changes -
          Link This issue blocks HDFS-1594 [ HDFS-1594 ]
          Aaron T. Myers added a comment -

          HDFS-1594 could make use of the proposed SAFEMODE_AUTOMATIC status.

          Aaron T. Myers made changes -
          Field Original Value New Value
          Link This issue is related to HDFS-1594 [ HDFS-1594 ]
          Matt Foley created issue -

            People

            • Assignee:
              Matt Foley
              Reporter:
              Matt Foley
            • Votes:
              0
              Watchers:
              8

              Dates

              • Created:
                Updated:
                Resolved:
