  1. Apache Drill
  2. DRILL-4955

Log Parser for Drill

    Details

    • Flags:
      Patch

      Description

      I've been experimenting with a generic log parser for Drill. Suppose you wanted Drill to ingest log files such as this MySQL log:

      070823 21:00:32       1 Connect     root@localhost on test1
      070823 21:00:48       1 Query       show tables
      070823 21:00:56       1 Query       select * from category
      070917 16:29:01      21 Query       select * from location
      070917 16:29:12      21 Query       select * from location where id = 1 LIMIT 1
      

      You could probably do it with string manipulation functions such as split and substring, but you'd end up with ugly and very complex queries.

      The extension I've built lets you supply Drill with a regex describing the line format and a list of field names, one per capture group, as shown below.

      "log": {
        "type": "log",
        "extensions": [
          "log"
        ],
        "fieldNames": [
          "date",
          "time",
          "pid",
          "action",
          "query"
        ],
        "pattern": "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)"
      }
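
      To illustrate how the pattern and the field list fit together, here is a minimal Python sketch that applies the same regex to the MySQL sample lines above and pairs each capture group with the field name at the same position. It is not the Drill extension itself: `parse_line` is a hypothetical helper, returning `None` for lines that do not match is an assumption (the issue does not say how such lines are handled), and Python's `re` module stands in for the Java regex engine Drill would use.

```python
import json
import re

# Format definition taken from the issue. The raw string keeps the
# JSON-level double backslashes, which json.loads turns into a plain regex.
FORMAT_CONFIG = json.loads(r"""
{
  "type": "log",
  "extensions": ["log"],
  "fieldNames": ["date", "time", "pid", "action", "query"],
  "pattern": "(\\d{6})\\s(\\d{2}:\\d{2}:\\d{2})\\s+(\\d+)\\s(\\w+)\\s+(.+)"
}
""")

# Sample lines from the MySQL log shown in the issue.
SAMPLE_LOG = """\
070823 21:00:32       1 Connect     root@localhost on test1
070823 21:00:48       1 Query       show tables
070823 21:00:56       1 Query       select * from category
070917 16:29:01      21 Query       select * from location
070917 16:29:12      21 Query       select * from location where id = 1 LIMIT 1
"""


def parse_line(line, config=FORMAT_CONFIG):
    """Hypothetical helper: map one log line to {field name: captured text}."""
    match = re.match(config["pattern"], line.strip())
    if match is None:
        # Assumption: unmatched lines are skipped; the issue does not specify this.
        return None
    # The n-th capture group is paired with the n-th entry of fieldNames.
    return dict(zip(config["fieldNames"], match.groups()))


rows = [parse_line(line) for line in SAMPLE_LOG.splitlines()]

if __name__ == "__main__":
    for row in rows:
        print(row)
```

      Every value comes back as a string in this sketch; any type conversion (for example treating `pid` as an integer) would be a separate step.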
      

      You can then query log files in this format in Drill. I'd like to submit this for inclusion in Drill if there is interest.

            People

            • Assignee:
              Unassigned
              Reporter:
              cgivre Charles Givre
              Reviewer:
              Parth Chandra