  1. Metron (Retired)
  2. METRON-865

Additional Mpack bug fixes and improvements that affect the Ambari database

Details

    • Type: Bug
    • Status: Done
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.4.0
    • None
    • Environment: Centos7

    Description

      Several bug fixes and recommended improvements were identified while implementing METRON-608 that are unrelated to METRON-609 (singlenode install). Almost all of them relate to Elasticsearch.

      About half the fixes did not impact the Ambari database, and were done in METRON-634 (PR#532).
      This jira provides the work items for changes that do impact the Ambari database, and should therefore be done with an Mpack version bump and database upgrade script. Implementation of these may be seen in the closed PR https://github.com/apache/metron/pull/425

      Affects Ambari database, needs db upgrade script:

      • status_params.py, which redundantly defines pid_dir as a python variable, is unnecessary and unused by the ES portion of the Mpack. It can be removed.
      • pid_dir SHOULD be specified in elastic-sysconfig.xml, rather than elastic-env.xml, as it is a parameter that must be provided to ES at launch-time, but is not something there's any reason for the admin to change in usual circumstances.
      • conf_dir SHOULD be specified in elastic-env.xml or elastic-site.xml, not in elastic-sysconfig.xml. While it too is a parameter that must be provided to ES at launch-time, it is typically left to the installing admin where to put the config files.
      • Several configuration parameter names in elastic-site.xml should be changed to make their semantics more obvious to a human reader (who may not be very familiar with Elasticsearch configuration). Mouse-over documentation will continue to provide the ES config parameter equivalents. In particular, the suggested renames are:
        cluster_name -> es_cluster_name  (to distinguish ES cluster from Stack cluster)
        zen_discovery_ping_unicast_hosts -> es_cluster_hosts
        network_host -> network_bindings  (these are in fact interface names, not host names)
        
      • "data_dir" should apparently be eliminated (from elastic-sysconfig) in favor of "path_data" (in elastic-site.xml). The latter value ends up overriding the former anyway, and the existence of the former is confusing and unnecessary.
      • All four configuration parameters in elastic-env.xml should be moved to elastic-site.xml: they are all reasonable to set in a "site.xml" file, they end up in the .yml file that ES uses instead of "site.xml", and they do NOT end up in environment variables. The only parameters that end up in env vars are set in elastic-sysconfig, and the ES launch process in fact ignores the elastic-env.sh file that is templated in elastic-env.xml (which consists only of JAVA_HOME and PATH). We could therefore eliminate elastic-env.sh and remove elastic-env.xml entirely, or keep the small elastic-env.sh file and its template as a reminder that JAVA_HOME must be defined.
      • In METRON/0.3.0/configuration/metron-env.xml and METRON/0.3.0/package/scripts/params/params_linux.py, the value "metron_apps_indexed_hdfs_dir" does not need to be settable by admin; it is appropriate to require it to be subordinate to "metron_apps_hdfs_dir". Thus it can be removed from metron-env.xml and set to
        "{metron_apps_hdfs_dir}/indexing/indexed" in params_linux.py. This also eliminates a really unacceptable use of "double format".

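As an illustration of two of the items above (the elastic-site.xml renames and the derived indexed-data path), here is a minimal Python sketch. It is not taken from the Mpack or from the upgrade work in the linked PR; the names ES_SITE_RENAMES, rename_es_site_properties, and indexed_hdfs_dir are hypothetical, and an actual database upgrade script would also have to read and write the stored Ambari configurations, which is not shown.

```python
# Hypothetical sketch only; helper names are illustrative, not from the Mpack.

# Old -> new property names proposed in this issue for elastic-site.xml.
ES_SITE_RENAMES = {
    "cluster_name": "es_cluster_name",
    "zen_discovery_ping_unicast_hosts": "es_cluster_hosts",
    "network_host": "network_bindings",
}


def rename_es_site_properties(old_props):
    """Return a new dict with the proposed names applied.

    Keys that are not being renamed are carried over unchanged.
    """
    return {ES_SITE_RENAMES.get(key, key): value
            for key, value in old_props.items()}


def indexed_hdfs_dir(metron_apps_hdfs_dir):
    """Derive the indexed-data directory from metron_apps_hdfs_dir,
    so it no longer needs its own admin-settable property."""
    return "{0}/indexing/indexed".format(metron_apps_hdfs_dir)
```

For example, with made-up values, rename_es_site_properties({"cluster_name": "metron"}) gives {"es_cluster_name": "metron"}, and indexed_hdfs_dir("/apps/metron") gives "/apps/metron/indexing/indexed". Formatting the path once from its parent is what makes a second formatting pass unnecessary.
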
            People

              Assignee: Unassigned
              Reporter: Matthew Foley (mattf)