Ambari / AMBARI-18470

RU/EU cannot start because ServiceCheckValidityCheck incorrectly calculates Service Checks that ran

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.5.0
    • Component/s: ambari-server
    • Labels: None

    Description

      The RU/EU PreCheck that verifies Service Checks were run after configuration changes reports incorrect failures.

      Example:

      Last Service Check should be more recent than the last configuration change for the given service 
      Reason: The following service configurations have been updated and their Service Checks should be run again: HIVE, SPARK, RANGER, YARN 
      Failed on: HIVE,SPARK,RANGER,YARN
      

      Workaround:
      Add stack.upgrade.bypass.prechecks=true to ambari.properties and restart Ambari Server.

      With this setting the PreCheck still reports the same failure, but the RU/EU is allowed to proceed.

      Logs:

      21 Sep 2016 10:59:20,905 WARN [ambari-client-thread-172] Errors:173 - The following warnings have been detected with resource and/or provider classes: 
      WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.ambari.server.api.services.ComponentService.getComponents(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo), should not consume any entity. 
      WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.ambari.server.api.services.ComponentService.getComponent(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo,java.lang.String,java.lang.String), should not consume any entity. 
      21 Sep 2016 10:59:35,690 INFO [ambari-client-thread-168] ServiceCheckValidityCheck:144 - Service HIVE latest config change is 09-20-2016 02:04:55, latest service check executed at 09-20-2016 01:47:31 
      21 Sep 2016 10:59:35,804 INFO [ambari-client-thread-168] ServiceCheckValidityCheck:144 - Service SPARK latest config change is 09-20-2016 10:58:58, latest service check executed at 04-15-2016 10:32:53 
      21 Sep 2016 10:59:35,805 INFO [ambari-client-thread-168] ServiceCheckValidityCheck:154 - Service RANGER service check has never been executed 
      21 Sep 2016 10:59:35,805 INFO [ambari-client-thread-168] ServiceCheckValidityCheck:144 - Service YARN latest config change is 09-20-2016 01:47:31, latest service check executed at 09-20-2016 01:44:43 
      21 Sep 2016 10:59:35,805 INFO [ambari-client-thread-172] ServiceCheckValidityCheck:144 - Service HIVE latest config change is 09-20-2016 02:04:54, latest service check executed at 09-20-2016 01:44:43 
      21 Sep 2016 10:59:35,808 INFO [ambari-client-thread-172] ServiceCheckValidityCheck:144 - Service SPARK latest config change is 09-20-2016 10:58:58, latest service check executed at 04-15-2016 10:32:53 
      21 Sep 2016 10:59:35,809 INFO [ambari-client-thread-172] ServiceCheckValidityCheck:154 - Service RANGER service check has never been executed 
      21 Sep 2016 10:59:35,810 INFO [ambari-client-thread-172] ServiceCheckValidityCheck:144 - Service YARN latest config change is 09-20-2016 02:04:54, latest service check executed at 09-20-2016 01:44:43 
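
      The log lines above suggest the comparison the PreCheck makes: a service fails when its latest Service Check is older than its latest configuration change, or when no Service Check has ever run. The following is a minimal sketch of that rule, not the actual ServiceCheckValidityCheck code. The class name, method name, and the numeric timestamps are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the actual Ambari implementation.
class ServiceCheckFreshness {

    /**
     * Returns the services whose latest Service Check is missing or
     * older than their latest configuration change.
     * Timestamps are plain epoch-style longs (illustrative values only).
     */
    static List<String> staleServices(Map<String, Long> lastConfigChange,
                                      Map<String, Long> lastServiceCheck) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastConfigChange.entrySet()) {
            Long checkedAt = lastServiceCheck.get(entry.getKey());
            // A missing timestamp means the check has never been executed.
            if (checkedAt == null || checkedAt < entry.getValue()) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        Map<String, Long> configChanges = new LinkedHashMap<>();
        configChanges.put("HIVE", 200L);   // config changed after the check
        configChanges.put("HDFS", 100L);   // check is newer than the config
        configChanges.put("RANGER", 50L);  // check never executed

        Map<String, Long> serviceChecks = new LinkedHashMap<>();
        serviceChecks.put("HIVE", 100L);
        serviceChecks.put("HDFS", 300L);

        System.out.println(staleServices(configChanges, serviceChecks));
    }
}
```

      The rule is only as good as the "latest Service Check" timestamp fed into it. If that timestamp comes from an incomplete set of records, services that were checked recently still show up as stale.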
      

      Root Cause:
      When the database contains more than 1000 Service Check records, the EU/RU PreCheck that ensures a Service Check has run after any configuration change to a service returns incorrect results, because it reads the first page of 1000 HostRoleCommand records instead of the most recent 1000.

      This happens because ServiceCheckValidityCheck.java does not impose an ordering when it creates a pagination request from TaskResourceProvider.java.
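
      The effect of paging without an ordering can be reproduced with a small in-memory model. This is a sketch under stated assumptions, not the patch attached to this issue: the Command class, its fields, and both method names are hypothetical stand-ins for HostRoleCommand rows and the paged query, and the unordered page is modeled as insertion order (oldest first).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of the paging problem; not Ambari code.
class PagedServiceCheckLookup {

    /** Minimal stand-in for a HostRoleCommand row (fields are assumptions). */
    static final class Command {
        final long taskId;
        final String service;
        final long startTime;

        Command(long taskId, String service, long startTime) {
            this.taskId = taskId;
            this.service = service;
            this.startTime = startTime;
        }
    }

    /** Takes the first page in stored order, as the issue describes. Returns -1 if none found. */
    static long latestFromFirstPage(List<Command> rows, String service, int pageSize) {
        return rows.stream()
                .limit(pageSize)
                .filter(c -> c.service.equals(service))
                .mapToLong(c -> c.startTime)
                .max()
                .orElse(-1L);
    }

    /** Sorts newest first before cutting the page. Returns -1 if none found. */
    static long latestFromNewestPage(List<Command> rows, String service, int pageSize) {
        return rows.stream()
                .sorted(Comparator.comparingLong((Command c) -> c.startTime).reversed())
                .limit(pageSize)
                .filter(c -> c.service.equals(service))
                .mapToLong(c -> c.startTime)
                .max()
                .orElse(-1L);
    }

    public static void main(String[] args) {
        // 1500 Service Check records for one service, oldest first.
        List<Command> rows = new ArrayList<>();
        for (int i = 1; i <= 1500; i++) {
            rows.add(new Command(i, "HIVE", i));
        }
        System.out.println("first page:  " + latestFromFirstPage(rows, "HIVE", 1000));
        System.out.println("newest page: " + latestFromNewestPage(rows, "HIVE", 1000));
    }
}
```

      With 1500 records, the first page never sees the 500 most recent Service Checks, so the reported "latest" execution time is stale even though newer checks exist. Ordering by time, newest first, before applying the page limit avoids this.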

      Attachments

        1. AMBARI-18470.branch-2.5.patch
          5 kB
          Alejandro Fernandez
        2. AMBARI-18470.trunk.patch
          5 kB
          Alejandro Fernandez

            People

              Assignee: Alejandro Fernandez (afernandez)
              Reporter: Saumil Mayani (smayani)
              Votes: 1
              Watchers: 5