HADOOP-11000 (Hadoop Common)

HAServiceProtocol's health state is incorrectly transitioned to SERVICE_NOT_RESPONDING

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • None
    • 2.7.0
    • None
    • None
    • Reviewed

    Description

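      HAServiceProtocol.monitorHealth can throw a HealthCheckFailedException. On the client side, however, the exception actually delivered by the protocol buffer RPC layer is a RemoteException that wraps the real exception, so the catch (HealthCheckFailedException e) branch in HealthMonitor never matches. The failure falls through to catch (Throwable t), and the state is incorrectly transitioned to SERVICE_NOT_RESPONDING instead of SERVICE_UNHEALTHY.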

      HealthMonitor.java
      doHealthChecks
      
            try {
              status = proxy.getServiceStatus();
              proxy.monitorHealth();
              healthy = true;
            } catch (HealthCheckFailedException e) {
              .....
              enterState(State.SERVICE_UNHEALTHY);
            } catch (Throwable t) {
              .....
              enterState(State.SERVICE_NOT_RESPONDING);
              .....
            }
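
      The snippet above can be made concrete with a small, self-contained sketch of one possible way to tell the two failure modes apart. It is not the attached patch. The classes below are simplified stand-ins written for illustration (the real Hadoop RemoteException and HealthCheckFailedException live in org.apache.hadoop and are not used here), and the names HealthCheckSketch, HealthProbe, isHealthCheckFailed and classify are hypothetical. The idea is to inspect the class name carried by the wrapper before deciding which state to enter.

```java
import java.io.IOException;

public class HealthCheckSketch {
    public enum State { SERVICE_HEALTHY, SERVICE_UNHEALTHY, SERVICE_NOT_RESPONDING }

    /** Simplified stand-in for Hadoop's HealthCheckFailedException (assumption). */
    public static class HealthCheckFailedException extends IOException {
        public HealthCheckFailedException(String msg) { super(msg); }
    }

    /** Simplified stand-in for the RPC wrapper: keeps only the remote class name. */
    public static class RemoteException extends IOException {
        private final String className;
        public RemoteException(String className, String msg) {
            super(msg);
            this.className = className;
        }
        public String getClassName() { return className; }
    }

    /** Hypothetical minimal view of the monitored service. */
    public interface HealthProbe { void monitorHealth() throws IOException; }

    /** True if t is a health-check failure, either directly or wrapped by the RPC layer. */
    public static boolean isHealthCheckFailed(Throwable t) {
        if (t instanceof HealthCheckFailedException) {
            return true;
        }
        return t instanceof RemoteException
            && HealthCheckFailedException.class.getName()
                   .equals(((RemoteException) t).getClassName());
    }

    /** Maps the outcome of one health check to a state. */
    public static State classify(HealthProbe probe) {
        try {
            probe.monitorHealth();
            return State.SERVICE_HEALTHY;
        } catch (Throwable t) {
            // Look inside the wrapper before concluding the service is unreachable.
            return isHealthCheckFailed(t)
                ? State.SERVICE_UNHEALTHY
                : State.SERVICE_NOT_RESPONDING;
        }
    }

    public static void main(String[] args) {
        String wrapped = HealthCheckFailedException.class.getName();
        System.out.println(classify(() -> { throw new RemoteException(wrapped, "unhealthy"); }));
        System.out.println(classify(() -> { throw new IOException("connection refused"); }));
        System.out.println(classify(() -> { }));
    }
}
```

      In real Hadoop code the wrapper also offers RemoteException.unwrapRemoteException(Class<?>...), which could play the role of the class-name comparison in this sketch.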
      
      

      Attachments

        1. HADOOP-11000.patch
          9 kB
          Ming Ma
        2. HADOOP-11000-2.patch
          9 kB
          Ming Ma

          People

            Assignee: Ming Ma (mingma)
            Reporter: Ming Ma (mingma)
            Votes: 0
            Watchers: 5