METRON-894

Ambari "Restart Metron Parsers" Fails If Any Parser Not Running

    Details

    • Type: Bug
    • Status: To Do
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.3.1
    • Fix Version/s: None
    • Labels: None

      Description

      The "Restart Metron Parsers" action failed in Ambari. It failed because the "stop" portion of the "restart" failed because the YAF topology was not running. This should not be treated as an error condition.

      I was able to work around this by using a "start" operation instead of a "restart", since "start" does not attempt to kill the topologies first.
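
      One way to avoid this (a minimal sketch, not the actual Metron code) is to guard each kill with Ambari's only_if, so that 'storm kill' runs only for topologies that 'storm list' reports as alive. The get_parser_list() helper is hypothetical; Execute, format, and self.__params.metron_user follow the traceback below.

          from resource_management.core.resources.system import Execute
          from resource_management.libraries.functions.format import format

          def stop_parser_topologies(self):
              for parser in self.get_parser_list():  # hypothetical helper
                  stop_cmd = format('storm kill {parser}')
                  # 'storm list' prints one row per live topology, so run
                  # the kill only when this parser appears in its output.
                  is_alive = format("storm list | grep -w '{parser}'")
                  Execute(stop_cmd, user=self.__params.metron_user, only_if=is_alive)

      Since 'storm list' makes a Nimbus round-trip, running it once and reusing the output would be cheaper than one check per parser.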

      stderr:   /var/lib/ambari-agent/data/errors-966.txt
      
      Traceback (most recent call last):
        File "/var/lib/ambari-agent/cache/common-services/METRON/0.4.0/package/scripts/parser_master.py", line 93, in <module>
          ParserMaster().execute()
        File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
          method(env)
        File "/var/lib/ambari-agent/cache/common-services/METRON/0.4.0/package/scripts/parser_master.py", line 81, in restart
          commands.restart_parser_topologies(env)
        File "/var/lib/ambari-agent/cache/common-services/METRON/0.4.0/package/scripts/parser_commands.py", line 146, in restart_parser_topologies
          self.stop_parser_topologies()
        File "/var/lib/ambari-agent/cache/common-services/METRON/0.4.0/package/scripts/parser_commands.py", line 141, in stop_parser_topologies
          Execute(stop_cmd, user=self.__params.metron_user)
        File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
          self.env.run()
        File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
          self.run_action(resource, action)
        File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
          provider_action()
        File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 273, in action_run
          tries=self.resource.tries, try_sleep=self.resource.try_sleep)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
          result = function(command, **kwargs)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
          tries=tries, try_sleep=try_sleep)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
          result = _call(command, **kwargs_copy)
        File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 293, in _call
          raise ExecutionFailed(err_msg, code, out, err)
      resource_management.core.exceptions.ExecutionFailed: Execution of 'storm kill yaf' returned 1. Running: /usr/jdk64/jdk1.8.0_77/bin/java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/hdp/2.5.3.0-37/storm -Dstorm.log.dir=/var/log/storm -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/hdp/current/storm-client/lib -Dstorm.conf.file= -cp /usr/hdp/2.5.3.0-37/storm/lib/clojure-1.7.0.jar:/usr/hdp/2.5.3.0-37/storm/lib/disruptor-3.3.2.jar:/usr/hdp/2.5.3.0-37/storm/lib/log4j-slf4j-impl-2.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/storm-rename-hack-1.0.1.2.5.3.0-37.jar:/usr/hdp/2.5.3.0-37/storm/lib/log4j-api-2.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/ring-cors-0.1.5.jar:/usr/hdp/2.5.3.0-37/storm/lib/log4j-core-2.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/asm-5.0.3.jar:/usr/hdp/2.5.3.0-37/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/hdp/2.5.3.0-37/storm/lib/slf4j-api-1.7.7.jar:/usr/hdp/2.5.3.0-37/storm/lib/servlet-api-2.5.jar:/usr/hdp/2.5.3.0-37/storm/lib/zookeeper.jar:/usr/hdp/2.5.3.0-37/storm/lib/minlog-1.3.0.jar:/usr/hdp/2.5.3.0-37/storm/lib/kryo-3.0.3.jar:/usr/hdp/2.5.3.0-37/storm/lib/storm-core-1.0.1.2.5.3.0-37.jar:/usr/hdp/2.5.3.0-37/storm/lib/reflectasm-1.10.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/objenesis-2.1.jar:/usr/hdp/2.5.3.0-37/storm/lib/ambari-metrics-storm-sink.jar:/usr/hdp/2.5.3.0-37/storm/extlib-daemon/ranger-storm-plugin-shim-0.6.0.2.5.3.0-37.jar:/usr/hdp/2.5.3.0-37/storm/extlib-daemon/ojdbc6.jar:/usr/hdp/2.5.3.0-37/storm/extlib-daemon/ranger-plugin-classloader-0.6.0.2.5.3.0-37.jar:/usr/hdp/current/storm-supervisor/conf:/usr/hdp/2.5.3.0-37/storm/bin org.apache.storm.command.kill_topology yaf
      Exception in thread "main" NotAliveException(msg:yaf is not alive)
      	at org.apache.storm.generated.Nimbus$killTopologyWithOpts_result$killTopologyWithOpts_resultStandardScheme.read(Nimbus.java:10748)
      	at org.apache.storm.generated.Nimbus$killTopologyWithOpts_result$killTopologyWithOpts_resultStandardScheme.read(Nimbus.java:10734)
      	at org.apache.storm.generated.Nimbus$killTopologyWithOpts_result.read(Nimbus.java:10676)
      	at org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
      	at org.apache.storm.generated.Nimbus$Client.recv_killTopologyWithOpts(Nimbus.java:383)
      	at org.apache.storm.generated.Nimbus$Client.killTopologyWithOpts(Nimbus.java:369)
      	at org.apache.storm.command.kill_topology$_main.doInvoke(kill_topology.clj:27)
      	at clojure.lang.RestFn.applyTo(RestFn.java:137)
      	at org.apache.storm.command.kill_topology.main(Unknown Source)
      stdout:   /var/lib/ambari-agent/data/output-966.txt
      
      2017-04-26 18:21:46,880 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.5.3.0-37
      2017-04-26 18:21:46,882 - Checking if need to create versioned conf dir /etc/hadoop/2.5.3.0-37/0
      2017-04-26 18:21:46,884 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
      2017-04-26 18:21:46,921 - call returned (1, '/etc/hadoop/2.5.3.0-37/0 exist already', '')
      2017-04-26 18:21:46,922 - checked_call[('ambari-python-wrap', u'/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
      2017-04-26 18:21:46,960 - checked_call returned (0, '')
      2017-04-26 18:21:46,962 - Ensuring that hadoop has the correct symlink structure
      2017-04-26 18:21:46,962 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
      2017-04-26 18:21:47,150 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.5.3.0-37
      2017-04-26 18:21:47,152 - Checking if need to create versioned conf dir /etc/hadoop/2.5.3.0-37/0
      2017-04-26 18:21:47,155 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
      2017-04-26 18:21:47,193 - call returned (1, '/etc/hadoop/2.5.3.0-37/0 exist already', '')
      2017-04-26 18:21:47,194 - checked_call[('ambari-python-wrap', u'/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
      2017-04-26 18:21:47,232 - checked_call returned (0, '')
      2017-04-26 18:21:47,233 - Ensuring that hadoop has the correct symlink structure
      2017-04-26 18:21:47,233 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
      2017-04-26 18:21:47,235 - Group['metron'] {}
      2017-04-26 18:21:47,238 - Group['livy'] {}
      2017-04-26 18:21:47,238 - Group['elasticsearch'] {}
      2017-04-26 18:21:47,238 - Group['spark'] {}
      2017-04-26 18:21:47,239 - Group['zeppelin'] {}
      2017-04-26 18:21:47,239 - Group['hadoop'] {}
      2017-04-26 18:21:47,239 - Group['kibana'] {}
      2017-04-26 18:21:47,240 - Group['users'] {}
      2017-04-26 18:21:47,240 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,242 - User['storm'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,243 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,244 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,245 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']}
      2017-04-26 18:21:47,246 - User['zeppelin'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,247 - User['metron'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,248 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,248 - User['elasticsearch'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,249 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,250 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'users']}
      2017-04-26 18:21:47,251 - User['kafka'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,252 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,253 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,254 - User['kibana'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,255 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,256 - User['hbase'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,257 - User['hcat'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': [u'hadoop']}
      2017-04-26 18:21:47,258 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
      2017-04-26 18:21:47,261 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
      2017-04-26 18:21:47,269 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
      2017-04-26 18:21:47,270 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'create_parents': True, 'mode': 0775, 'cd_access': 'a'}
      2017-04-26 18:21:47,272 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
      2017-04-26 18:21:47,274 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] {'not_if': '(test $(id -u hbase) -gt 1000) || (false)'}
      2017-04-26 18:21:47,281 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] due to not_if
      2017-04-26 18:21:47,282 - Group['hdfs'] {}
      2017-04-26 18:21:47,283 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': [u'hadoop', u'hdfs']}
      2017-04-26 18:21:47,284 - FS Type: 
      2017-04-26 18:21:47,284 - Directory['/etc/hadoop'] {'mode': 0755}
      2017-04-26 18:21:47,308 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
      2017-04-26 18:21:47,310 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
      2017-04-26 18:21:47,330 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
      2017-04-26 18:21:47,341 - Skipping Execute[('setenforce', '0')] due to not_if
      2017-04-26 18:21:47,342 - Directory['/var/log/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'mode': 0775, 'cd_access': 'a'}
      2017-04-26 18:21:47,346 - Directory['/var/run/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'root', 'cd_access': 'a'}
      2017-04-26 18:21:47,346 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'create_parents': True, 'cd_access': 'a'}
      2017-04-26 18:21:47,354 - File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
      2017-04-26 18:21:47,357 - File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
      2017-04-26 18:21:47,358 - File['/usr/hdp/current/hadoop-client/conf/log4j.properties'] {'content': ..., 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
      2017-04-26 18:21:47,377 - File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs', 'group': 'hadoop'}
      2017-04-26 18:21:47,378 - File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
      2017-04-26 18:21:47,379 - File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
      2017-04-26 18:21:47,386 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop'}
      2017-04-26 18:21:47,391 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
      2017-04-26 18:21:47,682 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.5.3.0-37
      2017-04-26 18:21:47,684 - Checking if need to create versioned conf dir /etc/hadoop/2.5.3.0-37/0
      2017-04-26 18:21:47,687 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
      2017-04-26 18:21:47,726 - call returned (1, '/etc/hadoop/2.5.3.0-37/0 exist already', '')
      2017-04-26 18:21:47,727 - checked_call[('ambari-python-wrap', u'/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.3.0-37', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
      2017-04-26 18:21:47,766 - checked_call returned (0, '')
      2017-04-26 18:21:47,767 - Ensuring that hadoop has the correct symlink structure
      2017-04-26 18:21:47,767 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
      2017-04-26 18:21:47,771 - Create Metron Local Config Directory
      2017-04-26 18:21:47,771 - Configure Metron global.json
      2017-04-26 18:21:47,771 - Directory['/usr/metron/0.4.0/config/zookeeper'] {'owner': 'metron', 'group': 'metron', 'mode': 0755}
      2017-04-26 18:21:47,781 - File['/usr/metron/0.4.0/config/zookeeper/global.json'] {'content': InlineTemplate(...), 'owner': 'metron'}
      2017-04-26 18:21:47,786 - File['/usr/metron/0.4.0/config/zookeeper/../elasticsearch.properties'] {'content': InlineTemplate(...), 'owner': 'metron'}
      2017-04-26 18:21:47,787 - Loading config into ZooKeeper
      2017-04-26 18:21:47,787 - Execute['/usr/metron/0.4.0/bin/zk_load_configs.sh --mode PUSH -i /usr/metron/0.4.0/config/zookeeper -z y113.l42scl.hortonworks.com:2181,y114.l42scl.hortonworks.com:2181,y115.l42scl.hortonworks.com:2181'] {'path': [u'/usr/jdk64/jdk1.8.0_77/bin']}
      2017-04-26 18:21:49,396 - Calling security setup
      2017-04-26 18:21:49,397 - Restarting the parser topologies
      2017-04-26 18:21:49,397 - Stopping parsers
      2017-04-26 18:21:49,397 - Stopping bro
      2017-04-26 18:21:49,397 - Execute['storm kill bro'] {'user': 'metron'}
      2017-04-26 18:21:55,400 - Stopping snort
      2017-04-26 18:21:55,401 - Execute['storm kill snort'] {'user': 'metron'}
      2017-04-26 18:22:01,016 - Stopping yaf
      2017-04-26 18:22:01,017 - Execute['storm kill yaf'] {'user': 'metron'}
      
      Command failed after 1 tries
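
      The traceback shows the call chain restart_parser_topologies() -> stop_parser_topologies() -> Execute('storm kill yaf'), which raises ExecutionFailed when the topology is not alive. An alternative sketch is to treat a failed stop as non-fatal inside the restart itself (method names follow the traceback; the Logger import and the start call are assumptions):

          from resource_management.core.exceptions import ExecutionFailed
          from resource_management.core.logger import Logger

          def restart_parser_topologies(self, env):
              try:
                  self.stop_parser_topologies()
              except ExecutionFailed:
                  # 'storm kill' exits non-zero for a topology that is not
                  # alive; for a restart that is acceptable, so keep going.
                  Logger.warning('Stop failed; assuming some parser topologies were not running')
              self.start_parser_topologies(env)  # assumed signature

      The drawback is that genuine stop failures are masked as well, so the per-topology only_if guard sketched in the description is the more precise fix.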
      


            People

            • Assignee: nickwallen (Nick Allen)
            • Reporter: nickwallen (Nick Allen)
