Ambari / AMBARI-21810

Create Utility Script to support Solr Collection Data Retention/Purging/Archiving

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.6.0
    • Component/s: ambari-infra
    • Labels: None

    Description

      In Ambari 3.0, LogSearch will include more fully featured support in this area. Until then, this script will be used in Ambari 2.6.0 to simplify customer use cases for log data retention, log purging, and log archiving.

      The script solrDataManager.py (located in the /usr/lib/ambari-infra-solr-client folder) accepts a mode parameter, which may be delete or save. In both modes the user may specify the filter field, either an end value or the number of days to keep, and optionally a Kerberos keytab and principal for Solr. In save mode the user should also specify a destination: arguments for HDFS, arguments for S3, or a local path. The user may also specify the size of the read block (the number of documents returned by one Solr query) and the write block (the number of documents in one output file).
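The relationship between "number of days to keep" and "end value" can be illustrated with a minimal Python 3 sketch. It is not taken from solrDataManager.py: the helper names `cutoff_from_days` and `range_filter` are hypothetical, and the inclusive upper bound in the range query is an assumption about how the filter could be expressed.

```python
from datetime import datetime, timedelta, timezone


def cutoff_from_days(days_to_keep, now=None):
    """Return a Solr-style UTC timestamp `days_to_keep` days before `now`.

    Hypothetical helper: shows how a days-to-keep value could be turned
    into an end value; the real script may compute this differently.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days_to_keep)
    # Solr date fields use ISO-8601 in UTC with a trailing 'Z';
    # keep millisecond precision to match the end value format above.
    millis = cutoff.microsecond // 1000
    return cutoff.strftime("%Y-%m-%dT%H:%M:%S.") + "%03dZ" % millis


def range_filter(field, end_value):
    """Build a Solr range query matching documents up to end_value.

    Assumption: the upper bound is inclusive.
    """
    return '%s:[* TO "%s"]' % (field, end_value)


if __name__ == "__main__":
    end = cutoff_from_days(1)
    print(range_filter("logtime", end))
```

Under these assumptions, passing a number of days is just a convenient way to derive the end value from the current time.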

      Examples:

      Save data from the Solr collection hadoop_logs, accessible at http://c6401.ambari.apache.org:8886/solr, filtering on the field logtime: save everything older than 1 day, read 10 documents at a time, write 100 documents into each file, and copy the zip files into the local directory /tmp. Run in verbose mode:

      /usr/bin/python solrDataManager.py -m save -s http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -d 1 -r 10 -w 100 -x /tmp -v
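How a read block of 10 and a write block of 100 interact can be sketched as follows. This is an illustration in Python 3 under stated assumptions, not the script's actual code: `write_blocks` is a hypothetical name, and the documents are stand-in integers rather than Solr results.

```python
def write_blocks(read_batches, write_block_size):
    """Regroup query-sized batches of documents into file-sized chunks.

    read_batches: iterable of lists, one list per (hypothetical) Solr query.
    write_block_size: number of documents that go into one output file.
    """
    chunk = []
    for batch in read_batches:
        for doc in batch:
            chunk.append(doc)
            if len(chunk) == write_block_size:
                yield chunk
                chunk = []
    if chunk:
        # The final chunk may hold fewer documents than write_block_size.
        yield chunk


if __name__ == "__main__":
    # 25 simulated queries returning 10 documents each (read block = 10).
    batches = [list(range(start, start + 10)) for start in range(0, 250, 10)]
    for index, chunk in enumerate(write_blocks(batches, 100)):
        print("file %d: %d documents" % (index, len(chunk)))
```

With 250 documents read 10 at a time, this sketch would produce two full files of 100 documents and a final file of 50.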
      

      Save everything older than 3 days from hadoop_logs into the HDFS path "/" as the user hdfs, fetching data from a kerberized Solr:

      /usr/bin/python solrDataManager.py -m save -s http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -d 3 -r 10 -w 100 -k /etc/security/keytabs/ambari-infra-solr.service.keytab -n infra-solr/c6401.ambari.apache.org@AMBARI.APACHE.ORG -u hdfs -p /
      

      Delete the data before 2017-08-29T12:00:00.000Z:

      /usr/bin/python solrDataManager.py -m delete -s http://c6401.ambari.apache.org:8886/solr -c hadoop_logs -f logtime -e 2017-08-29T12:00:00.000Z
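For orientation, a delete like this one corresponds to a Solr delete-by-query request. The Python 3 sketch below only builds the URL and JSON body for Solr's standard update handler; it is an assumption that the script issues an equivalent request, `delete_request` is a hypothetical helper, and the inclusive range bound and `commit=true` parameter are illustrative choices. Sending the request (and any Kerberos authentication) is deliberately left out.

```python
import json


def delete_request(solr_url, collection, field, end_value):
    """Return (url, body) for a Solr delete-by-query on a range filter.

    Hypothetical helper: the request shape follows Solr's JSON update
    format, but whether solrDataManager.py does exactly this is an assumption.
    """
    url = "%s/%s/update?commit=true" % (solr_url.rstrip("/"), collection)
    query = '%s:[* TO "%s"]' % (field, end_value)
    body = json.dumps({"delete": {"query": query}})
    return url, body


if __name__ == "__main__":
    url, body = delete_request(
        "http://c6401.ambari.apache.org:8886/solr",
        "hadoop_logs",
        "logtime",
        "2017-08-29T12:00:00.000Z",
    )
    print(url)
    print(body)
```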
      

          People

            Assignee: Miklos Gergely (mgergely)
            Reporter: Miklos Gergely (mgergely)