Hadoop Common / HADOOP-342

Design/Implement a tool to support archival and analysis of logfiles.

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: None
    • Labels: None

      Description

      Requirements:

      a) Create a tool to support archival of logfiles (from diverse sources) in Hadoop's DFS.
      b) The tool should also support analysis of the logfiles via grep/sort primitives. It should allow fairly generic 'grep' patterns and let users 'sort' the matching lines on 'columns' of their choice.

      E.g. from Hadoop logs: find all log lines containing 'FATAL' and sort them first by timestamp (column x) and then by column y.

      Design/Implementation:

      a) Log Archival

      Archival of logs from diverse sources can be accomplished using the distcp tool (HADOOP-341).

      b) Log analysis

      The idea is to enable users of the tool to perform analysis of logs via grep/sort primitives.

      This can be accomplished via a relatively simple Map-Reduce task: the map greps for the given pattern via RegexMapper, and the implicit sort (in the reduce phase) is used with a custom Comparator that performs the user-specified comparison (on columns).

      The sort/grep specs can be fairly powerful, since the user can express them with Java's built-in regex patterns (java.util.regex).
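
      As a rough illustration of the grep-then-sort idea, here is a minimal sketch in plain Java. It uses no Hadoop APIs and is not taken from the attached patches: the class LogGrepSort, its methods, the whitespace column separator, and the sample log lines are all hypothetical. The grep step stands in for the map, and the column comparator stands in for the custom Comparator applied during the sort. Columns are compared as plain strings, which works for timestamps only when they are written in a lexicographically sortable form.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical, single-process stand-in for the grep/sort job described above.
public class LogGrepSort {

    // Stand-in for the map step: keep only lines that contain a match for the pattern.
    static List<String> grep(List<String> lines, Pattern pattern) {
        List<String> matched = new ArrayList<>();
        for (String line : lines) {
            if (pattern.matcher(line).find()) {
                matched.add(line);
            }
        }
        return matched;
    }

    // Stand-in for the custom Comparator: order lines by the given 0-based
    // columns, in priority order. Missing columns compare as empty strings.
    static Comparator<String> byColumns(String separatorRegex, int... columns) {
        Pattern separator = Pattern.compile(separatorRegex);
        return (a, b) -> {
            String[] fieldsA = separator.split(a);
            String[] fieldsB = separator.split(b);
            for (int c : columns) {
                String x = c < fieldsA.length ? fieldsA[c] : "";
                String y = c < fieldsB.length ? fieldsB[c] : "";
                int cmp = x.compareTo(y);
                if (cmp != 0) {
                    return cmp;
                }
            }
            return 0;
        };
    }

    // Grep for grepRegex, then sort the matching lines on the chosen columns.
    static List<String> analyze(List<String> lines, String grepRegex,
                                String separatorRegex, int... columns) {
        List<String> matched = grep(lines, Pattern.compile(grepRegex));
        matched.sort(byColumns(separatorRegex, columns));
        return matched;
    }

    public static void main(String[] args) {
        // Made-up sample lines: column 0 is the date, column 1 is the time.
        List<String> logs = Arrays.asList(
            "2006-07-05 12:00:03 FATAL datanode disk failure",
            "2006-07-05 12:00:01 INFO namenode started",
            "2006-07-05 12:00:01 FATAL namenode out of memory",
            "2006-07-04 23:59:59 FATAL datanode lost heartbeat");
        for (String line : analyze(logs, "FATAL", "\\s+", 0, 1)) {
            System.out.println(line);
        }
    }
}
```

      Running main prints the three FATAL lines ordered by date and then by time. In a real Map-Reduce job the filtering and the ordering would be split across the map and the framework's sort, rather than done in one process as here.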

      1. logalyzer2.patch
        10 kB
        Arun C Murthy
      2. logalyzer.patch
        10 kB
        Arun C Murthy


            People

            • Assignee:
              Unassigned
              Reporter:
              Arun C Murthy