  1. Ambari
  2. AMBARI-21810

Create Utility Script to support Solr Collection Data Retention/Purging/Archiving

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.6.0
    • Component/s: ambari-infra
    • Labels: None

    Description

      In Ambari 3.0, LogSearch will include fuller support in this area. Until then, this script will be used in Ambari 2.6.0 to simplify the customer's use cases for log data retention, log purging, and log archiving.

      The script solrDataManager.py (located in the /usr/lib/ambari-infra-solr-client folder) accepts a mode parameter, which may be delete or save. In both modes the user may specify the filter field, either an end value or the number of days to keep, and optionally a Kerberos keytab and principal for Solr. In save mode the user should also specify where to save the data: arguments for HDFS, arguments for S3, or a local path. The user may also specify the size of the read block (the number of documents returned by one Solr query) and the size of the write block (the number of documents in one output file).
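
      To illustrate how a "days to keep" value could relate to an end value, here is a minimal sketch. It is not taken from solrDataManager.py: the function name end_value_for_days and the exact timestamp format are assumptions, and it is written for Python 3, whereas the script itself is run with /usr/bin/python.

```python
from datetime import datetime, timedelta, timezone


def end_value_for_days(days_to_keep, now=None):
    """Return a cutoff timestamp: documents older than this fall outside
    the retention window. Hypothetical helper, not part of the script."""
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days_to_keep)
    # Format as an ISO-8601 UTC timestamp with millisecond precision,
    # matching the shape of the end value used in the delete example.
    millis = cutoff.microsecond // 1000
    return cutoff.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % millis


if __name__ == "__main__":
    fixed_now = datetime(2017, 8, 30, 12, 0, 0, tzinfo=timezone.utc)
    print(end_value_for_days(1, now=fixed_now))
```

      Under this assumption, keeping 1 day of data relative to 2017-08-30T12:00:00Z gives the same cutoff as the explicit end value used in the delete example.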

      Examples:

      Save data from the Solr collection hadoop_logs accessible at http://c6401.ambari.apache.org:8886/solr based on the field logtime, save everything older than 1 day, read 10 documents at once, write 100 documents into a file, and copy the zip files into the local directory /tmp. Do this in verbose mode:

      /usr/bin/python solrDataManager.py -m save -s http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -d 1 -r 10 -w 100 -x /tmp -v
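
      The relationship between the read block (-r 10) and the write block (-w 100) can be pictured with the sketch below. It only shows the regrouping idea and makes no claim about how solrDataManager.py implements it; the function name regroup is hypothetical, and the lists of integers stand in for Solr documents.

```python
def regroup(read_blocks, write_block_size):
    """Collect documents arriving in read blocks and yield them in
    write blocks of at most write_block_size documents each."""
    buffer = []
    for block in read_blocks:
        for doc in block:
            buffer.append(doc)
            if len(buffer) == write_block_size:
                yield buffer
                buffer = []
    # The last write block may be smaller than the requested size.
    if buffer:
        yield buffer


if __name__ == "__main__":
    # 25 read blocks of 10 placeholder documents each (250 in total).
    reads = [list(range(i, i + 10)) for i in range(0, 250, 10)]
    for number, block in enumerate(regroup(reads, 100)):
        print("file %d: %d documents" % (number, len(block)))
```

      With these sizes, ten queries fill one output file, and any remainder ends up in a final, smaller file.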
      

      Save everything older than 3 days from hadoop_logs into the HDFS path "/" as the user hdfs, fetching data from a kerberized Solr:

      /usr/bin/python solrDataManager.py -m save -s http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -d 3 -r 10 -w 100 -k /etc/security/keytabs/ambari-infra-solr.service.keytab -n infra-solr/c6401.ambari.apache.org@AMBARI.APACHE.ORG -u hdfs -p /
      

      Delete the data before 2017-08-29T12:00:00.000Z:

      /usr/bin/python solrDataManager.py -m delete -s http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -e 2017-08-29T12:00:00.000Z
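
      For orientation, a delete of this kind can be expressed in Solr as a delete-by-query request sent to the collection's update handler. The sketch below only builds such a request body and sends nothing. Whether solrDataManager.py issues exactly this query, and whether it treats the end value as inclusive or exclusive, is an assumption; the function name build_delete_payload is hypothetical.

```python
import json


def build_delete_payload(filter_field, end_value):
    """Build a Solr delete-by-query JSON body covering every document whose
    filter_field is strictly before end_value (assumed to be exclusive)."""
    # Range query with an open lower bound and an exclusive upper bound.
    query = '%s:[* TO "%s"}' % (filter_field, end_value)
    return json.dumps({"delete": {"query": query}})


if __name__ == "__main__":
    print(build_delete_payload("logtime", "2017-08-29T12:00:00.000Z"))
```

      A body like this would be posted as JSON to the update handler of the hadoop_logs collection, followed by a commit.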
      

      Attachments

        1. AMBARI-21810.patch
          27 kB
          Miklos Gergely

            People

              Assignee: Miklos Gergely (mgergely)
              Reporter: Miklos Gergely (mgergely)