Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-11000

HAServiceProtocol's health state is incorrectly transitioned to SERVICE_NOT_RESPONDING

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.7.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When HAServiceProtocol.monitorHealth throws a HealthCheckFailedException, the actual exception from protocol buffer RPC is a RemoteException that wraps the real exception. Thus the state is incorrectly transitioned to SERVICE_NOT_RESPONDING

      HealthMonitor.java
      doHealthChecks
      
            try {
              status = proxy.getServiceStatus();
              proxy.monitorHealth();
              healthy = true;
            } catch (HealthCheckFailedException e) {
              .....
              enterState(State.SERVICE_UNHEALTHY);
            } catch (Throwable t) {
              .....
              enterState(State.SERVICE_NOT_RESPONDING);
              .....
            }
      
      

        Attachments

        1. HADOOP-11000-2.patch
          9 kB
          Ming Ma
        2. HADOOP-11000.patch
          9 kB
          Ming Ma

          Activity

            People

            • Assignee:
              mingma Ming Ma
              Reporter:
              mingma Ming Ma
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: