Hadoop Common / HADOOP-291

Hadoop Log Archiver/Analyzer utility

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: util
    • Labels: None

    Description

      Overview of the log archiver/analyzer utility...

      1. Input
      The tool takes a list of directory URLs as input. Each URL may also carry a file pattern that selects which files in that directory are to be used, e.g.
      http://g1015:50030/logs/hadoop-sameer-jobtracker-*
      file:///export/crawlspace/sanjay/hadoop/trunk/run/logs/hadoop-sanjay-namenode-* (local disk on the machine from which the job was submitted)
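
      A minimal sketch of how such an input URL might be split into a directory and a file pattern. The function names (split_input_url, select_files) and the rule "no wildcard in the last path segment means every file in the directory" are assumptions made for illustration; the proposal does not define them.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse
import posixpath


def split_input_url(url):
    """Split an input URL into (scheme, host, directory, file_pattern)."""
    parsed = urlparse(url)
    directory, pattern = posixpath.split(parsed.path)
    # Assumption: if the last path segment has no wildcard, the URL names
    # a directory and every file in it is selected.
    if "*" not in pattern and "?" not in pattern:
        directory, pattern = parsed.path.rstrip("/"), "*"
    return parsed.scheme, parsed.netloc, directory, pattern


def select_files(filenames, pattern):
    """Keep only the file names that match the shell-style pattern."""
    return [name for name in filenames if fnmatchcase(name, pattern)]


scheme, host, directory, pattern = split_input_url(
    "http://g1015:50030/logs/hadoop-sameer-jobtracker-*"
)
print(scheme, host, directory, pattern)
```

      A file:// URL goes through the same path, with an empty host.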

      2. The tool supports two main functions:

      a) Archival
      Archive the logs in the DFS under the following hierarchy by default:
      /users/<username>/log-archive/YYYY/mm/dd/HHMMSS.log
      or, if the user specifies a directory, under:
      <input-dir>/YYYY/mm/dd/HHMMSS.log
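
      The archive layout above could be computed roughly as follows. This is only a sketch: archive_path is a hypothetical helper, and it assumes the timestamp is supplied by the caller, since the proposal does not say whether it is the archival time or a time taken from the log itself.

```python
from datetime import datetime
import posixpath


def archive_path(username, timestamp, base_dir=None):
    """Build the DFS path for one archived log.

    With no base_dir the default /users/<username>/log-archive root is
    used; otherwise the user-specified directory is the root.
    """
    if base_dir is None:
        root = posixpath.join("/users", username, "log-archive")
    else:
        root = base_dir
    return posixpath.join(
        root,
        timestamp.strftime("%Y/%m/%d"),       # YYYY/mm/dd
        timestamp.strftime("%H%M%S") + ".log"  # HHMMSS.log
    )


when = datetime(2006, 6, 9, 14, 5, 30)  # example timestamp, chosen arbitrarily
print(archive_path("sameer", when))
print(archive_path("sameer", when, base_dir="/archive/logs"))
```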

      b) Processing with simple sort/grep primitives
      Archive the logs as above, then grep for lines matching a given pattern (e.g. INFO), and then sort the matches by a given spec, e.g. <logger><level><date>. (Note: this is proposed with the current log4j-based logging in mind; do we need anything more generic?) The sort/grep specs are user-provided, along with the directory URLs.
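
      As a rough, single-process illustration of the grep-then-sort primitive: the sketch below assumes a log4j-style line layout of "<date> <level> <logger>: <message>" and represents the sort spec as a list of field names. Both the layout and the names (LINE_RE, grep_and_sort) are assumptions; the real line format depends on the log4j configuration in use.

```python
import re

# Assumed line layout: "2006-06-09 14:05:30,123 INFO some.Logger: message"
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>[^:\s]+): (?P<message>.*)$"
)


def grep_and_sort(lines, pattern, sort_spec):
    """Keep lines matching `pattern`, then sort them by the fields in `sort_spec`."""
    matcher = re.compile(pattern)
    keyed = []
    for line in lines:
        if not matcher.search(line):
            continue
        fields = LINE_RE.match(line)
        if fields is None:
            # Lines that do not parse (e.g. stack-trace continuations)
            # are skipped in this sketch.
            continue
        keyed.append((tuple(fields.group(name) for name in sort_spec), line))
    keyed.sort(key=lambda pair: pair[0])
    return [line for _, line in keyed]


sample = [
    "2006-06-09 14:05:31,000 INFO org.apache.hadoop.mapred.TaskTracker: b",
    "2006-06-09 14:05:30,000 WARN org.apache.hadoop.mapred.JobTracker: c",
    "2006-06-09 14:05:32,000 INFO org.apache.hadoop.mapred.JobTracker: a",
    "2006-06-09 14:05:29,000 INFO org.apache.hadoop.mapred.JobTracker: d",
]
for line in grep_and_sort(sample, "INFO", ["logger", "level", "date"]):
    print(line)
```

      In the proposed tool the same two steps would run as map-reduce tasks over the archived files rather than over an in-memory list.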

      3. Thoughts on implementation...

      a) Archival
      The current idea is to put a .jsp page (src/webapps) on each node, which does a copyFromLocal of the log file into the DFS. The jobtracker fires n map tasks, which only hit the .jsp page for each of the directory URLs. The reduce task is a no-op apart from collecting statistics on failures (if any).
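
      The shape of this archival job can be sketched without Hadoop at all. In the sketch below, every name (archive_map, archive_reduce, run_archival) is hypothetical, and `fetch` stands in for whatever HTTP request would hit the per-node page; it is passed in as a parameter so that no real endpoint is assumed.

```python
from collections import Counter


def archive_map(directory_url, fetch):
    """One map task: hit the archival page for a single directory URL."""
    try:
        fetch(directory_url)
        return ("ok", directory_url)
    except Exception as err:  # any failure is reported, not raised
        return ("failed", f"{directory_url}: {err}")


def archive_reduce(map_outputs):
    """The reduce step only gathers failure statistics."""
    counts = Counter(status for status, _ in map_outputs)
    failures = [detail for status, detail in map_outputs if status == "failed"]
    return {"ok": counts["ok"], "failed": counts["failed"], "failures": failures}


def run_archival(directory_urls, fetch):
    """Run one map per directory URL, then a single reduce."""
    return archive_reduce([archive_map(url, fetch) for url in directory_urls])


def fake_fetch(url):
    # Stand-in for the real request; fails for one made-up host.
    if "down-node" in url:
        raise IOError("unreachable")


print(run_archival(["http://node-a/logs", "http://down-node/logs"], fake_fetch))
```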

      b) Processing with sort/grep
      Here the tool first archives the files as above; a second set of map-reduce tasks then runs the sort/grep over the files in the DFS with the given specs.

      - * - * -

      Suggestions/corrections welcome...

      thanks,
      Arun

          People

            Assignee: Unassigned
            Reporter: Arun Murthy (acmurthy)
            Votes: 0
            Watchers: 0
