Ambari / AMBARI-6184

Incorrect value for started_count of Datanode component

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.1
    • Fix Version/s: 1.6.1
    • Component/s: ambari-agent
    • Labels: None

    Description

      Steps to reproduce:

      1. Install a 3-node cluster on the HDP 1.3 stack with HDFS, MapReduce, Nagios, Ganglia, and ZooKeeper, with slave components installed on all 3 hosts.
      2. Enable security without setting up Kerberos.
      3. When the security wizard fails as expected, disable security.
      4. After security has been disabled successfully, the following API call returns an incorrect started_count for the DataNode component. It reports 0, although DataNode is actually running on all hosts:
        http://server:8080/api/v1/clusters/c1/components/?ServiceComponentInfo/category.in(SLAVE,CLIENT)&fields=ServiceComponentInfo/service_name,ServiceComponentInfo/installed_count,ServiceComponentInfo/started_count,ServiceComponentInfo/total_count&minimal_response=true
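
      The check in step 4 can be scripted. The sketch below rebuilds the URL from the report and pulls the counts out of a response body. The helper names (components_url, started_counts) are hypothetical, and the response shape (an "items" list whose entries carry a "ServiceComponentInfo" object with "component_name") is an assumption, not taken from this issue; adjust it to what your server actually returns.

```python
def components_url(server, cluster):
    """Build the component-count query quoted in the report (hypothetical helper)."""
    fields = ",".join(
        "ServiceComponentInfo/" + name
        for name in ("service_name", "installed_count", "started_count", "total_count")
    )
    return (
        "http://" + server + "/api/v1/clusters/" + cluster + "/components/"
        "?ServiceComponentInfo/category.in(SLAVE,CLIENT)"
        "&fields=" + fields + "&minimal_response=true"
    )


def started_counts(payload):
    """Map component name -> (started_count, total_count).

    Assumes the payload looks like
    {"items": [{"ServiceComponentInfo": {"component_name": ..., ...}}]}.
    """
    counts = {}
    for item in payload.get("items", []):
        info = item.get("ServiceComponentInfo", {})
        name = info.get("component_name")
        if name is not None:
            counts[name] = (info.get("started_count"), info.get("total_count"))
    return counts


# Illustrative payload mirroring the reported symptom (not a captured response).
sample = {
    "items": [
        {"ServiceComponentInfo": {"component_name": "DATANODE",
                                  "started_count": 0, "total_count": 3}}
    ]
}
```

      With the illustrative payload, started_counts(sample)["DATANODE"] yields (0, 3), which is the mismatch described above: three DataNodes installed and running, none counted as started.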
        

      Reason:
      During the incorrect Kerberos setup, the DN processes fail to start but leave behind a stale pid file owned by root. The next DN start command starts the DN process but cannot overwrite the pid file, so the server considers the DN stopped. If the DN is started once more, the command fails soon after start (because of the lock file in the data dir, which is held by the already running DN). The agent reports to the server that the DN is not running, so the server displays information that is correct from its point of view.
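
      To illustrate why a stale pid file makes a live process look stopped, here is a minimal sketch of a generic pid-file liveness check. It is not the ambari-agent code; the function name is_running is hypothetical, and the check (read the pid, then probe it with signal 0) is only an assumption about how such status checks commonly work.

```python
import os


def is_running(pid_file):
    """Return True if the pid recorded in pid_file belongs to a live process."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or garbled pid file: treated as "not running".
        return False
    try:
        os.kill(pid, 0)  # Signal 0 only checks existence; nothing is delivered.
    except ProcessLookupError:
        # The file still names a process that has exited (stale pid file).
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

      Under this kind of check, a DN that is running but was unable to overwrite the root-owned pid file is still judged by the old pid in that file, so it is reported as stopped.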

            People

              Assignee: Dmitry Lysnichenko (dmitriusan)
              Reporter: Dmitry Lysnichenko (dmitriusan)