  1. Chukwa
  2. CHUKWA-472

TsProcessor: make date format configurable

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Release Note:
      TsProcessor time format is configurable.

      Description

      The TsProcessor's default date format and its date format for a given data type should both be configurable.

      • To set time format for a given data type:
        <property>
         <name>TsProcessor.time.format.some_data_type</name>
         <value>yyyy-MM-dd HH:mm:ss,SSS</value>
        </property>
        
      • To set the default time format:
        <property>
         <name>TsProcessor.default.time.format</name>
         <value>yyyy-MM-dd HH:mm:ss,SSS</value>
        </property>
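
The lookup order implied by these two properties can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: the class name `TimeFormatLookup`, the plain `Map` standing in for the real configuration object, and the built-in fallback pattern are all hypothetical. Only the two property names come from the description above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

public class TimeFormatLookup {
    // Property names as given in the issue description.
    static final String DEFAULT_KEY = "TsProcessor.default.time.format";
    static final String PER_TYPE_PREFIX = "TsProcessor.time.format.";
    // Hypothetical built-in fallback used when neither property is set.
    static final String BUILT_IN_FORMAT = "yyyy-MM-dd HH:mm:ss,SSS";

    // Per-data-type setting wins, then the configured default, then the fallback.
    static String resolveFormat(Map<String, String> conf, String dataType) {
        String perType = conf.get(PER_TYPE_PREFIX + dataType);
        if (perType != null) {
            return perType;
        }
        String configuredDefault = conf.get(DEFAULT_KEY);
        return configuredDefault != null ? configuredDefault : BUILT_IN_FORMAT;
    }

    // Parses a timestamp from the start of the record; DateFormat.parse(String)
    // reads from the beginning and does not require consuming the whole string.
    static Date parseLeadingTimestamp(String record, String pattern) {
        try {
            return new SimpleDateFormat(pattern).parse(record);
        } catch (ParseException e) {
            throw new IllegalArgumentException("No timestamp at start of record: " + record, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(DEFAULT_KEY, "yyyy-MM-dd HH:mm:ss,SSS");
        conf.put(PER_TYPE_PREFIX + "some_data_type", "yyyy/MM/dd HH:mm:ss");

        String pattern = resolveFormat(conf, "some_data_type");
        Date ts = parseLeadingTimestamp("2010/04/22 15:07:27 INFO sample line", pattern);
        System.out.println(pattern + " parsed as " + ts);
        System.out.println("other_type uses " + resolveFormat(conf, "other_type"));
    }
}
```

Resolving the pattern per data type keeps the default untouched for every other data type, which is the behavior the two properties suggest.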
        
      1. ASF.LICENSE.NOT.GRANTED--CHUKWA-472.1.patch
        7 kB
        Bill Graham
      2. CHUKWA-472.2.patch
        12 kB
        Bill Graham

          Activity

          billgraham Bill Graham added a comment -

          Attaching CHUKWA-472.1.patch. This patch requires CHUKWA-471.patch to be applied first.

          billgraham Bill Graham added a comment -

          Canceling this patch, since I want to add one more bit of functionality that I think will be useful. The current implementation expects the date to be the first set of characters in the record. In some cases (e.g. Apache logs) that's not the case. I'm adding the ability to optionally specify a regular expression that locates the date string within the record.

          For a record like the following, for example, you could use configs like the ones below it:

          10.10.182.49 [22/Apr/2010:15:07:27 -0700] "" 200 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3" "some.site.com:8076"

            <property>
             <name>TsProcessor.time.regex.some_data_type</name>
             <value>^(?:[\\d.]+) \\[(\\d{2}/\\w{3}/\\d{4}:\\d{2}:\\d{2}:\\d{2} [-+]\\d{4})\\] .*</value>
            </property>
          
            <property>
             <name>TsProcessor.default.time.regex</name>
             <value>^(?:[\\d.]+) \\[(\\d{2}/\\w{3}/\\d{4}:\\d{2}:\\d{2}:\\d{2} [-+]\\d{4})\\] .*</value>
            </property>
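
A minimal Java sketch of how such a regex could be applied to the sample record: match the whole record, take capture group 1 as the date string, then parse it. The class name `RegexTimestampSketch`, the helper names, and the date format `dd/MMM/yyyy:HH:mm:ss Z` are assumptions made for illustration; the thread does not show the matching code or the time format used for this data type.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTimestampSketch {
    // The regex from the sample configuration, written as a Java string literal
    // (which is why every backslash is doubled here).
    static final Pattern TIME_REGEX = Pattern.compile(
        "^(?:[\\d.]+) \\[(\\d{2}/\\w{3}/\\d{4}:\\d{2}:\\d{2}:\\d{2} [-+]\\d{4})\\] .*");

    // Assumed format for the captured text; not stated in the thread.
    static final String TIME_FORMAT = "dd/MMM/yyyy:HH:mm:ss Z";

    // Returns capture group 1, the date string located by the regex.
    static String extractDateString(String record) {
        Matcher m = TIME_REGEX.matcher(record);
        if (!m.matches()) {
            throw new IllegalArgumentException("Record does not match time regex: " + record);
        }
        return m.group(1);
    }

    // Converts the extracted date string to epoch milliseconds.
    static long toEpochMillis(String dateString) {
        try {
            // Locale.ENGLISH so that the month abbreviation "Apr" parses on any machine.
            return new SimpleDateFormat(TIME_FORMAT, Locale.ENGLISH).parse(dateString).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + dateString, e);
        }
    }

    public static void main(String[] args) {
        String record = "10.10.182.49 [22/Apr/2010:15:07:27 -0700] \"\" 200 \"-\" "
            + "\"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) "
            + "Gecko/20100401 Firefox/3.6.3\" \"some.site.com:8076\"";
        String dateString = extractDateString(record);
        System.out.println(dateString);
        System.out.println(toEpochMillis(dateString));
    }
}
```

Because the captured text carries its own UTC offset, the parsed instant does not depend on the time zone of the machine running the code.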
          
          billgraham Bill Graham added a comment -

          Attaching CHUKWA-472.2.patch, which implements the additional functionality described above.

          asrabkin Ari Rabkin added a comment -

          +1 will commit this weekend barring objection.

          eyang Eric Yang added a comment -

          I just committed this, thanks Bill.

          billgraham Bill Graham added a comment -

          Thanks Eric.

          FYI, for anyone with a tendency to copy and paste: the sample configuration regex values shown above should have only single-backslash escapes when used in configuration files (the double escapes came from values in Java code).
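
The difference between the two escaping levels can be checked with a small Java sketch. It uses a shortened, hypothetical regex rather than the full one from the comments, and the class and method names are illustrative only. A Java string literal with doubled backslashes holds the same characters as the single-backslash text in a configuration file, whereas doubled backslashes pasted into a configuration file produce a different regex.

```java
import java.util.regex.Pattern;

public class EscapeSketch {
    // In Java source a backslash must itself be escaped, so this literal
    // holds the 11 characters  \[(\d{2})\]
    static final String FROM_JAVA_SOURCE = "\\[(\\d{2})\\]";

    // The text a configuration file would contain with single-backslash escapes,
    // built character by character to make the backslash count explicit.
    static String singleEscapedConfigText() {
        char backslash = '\\';
        return backslash + "[(" + backslash + "d{2})" + backslash + "]";
    }

    // The text a configuration file would contain if the Java-style
    // double backslashes were pasted in unchanged.
    static String doubleEscapedConfigText() {
        return singleEscapedConfigText().replace("\\", "\\\\");
    }

    // True when the given regex text matches the whole input.
    static boolean matchesWhole(String regexText, String input) {
        return Pattern.compile(regexText).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(FROM_JAVA_SOURCE.equals(singleEscapedConfigText()));
        System.out.println(matchesWhole(singleEscapedConfigText(), "[22]"));
        System.out.println(matchesWhole(doubleEscapedConfigText(), "[22]"));
    }
}
```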


            People

            • Assignee:
              billgraham Bill Graham
              Reporter:
              billgraham Bill Graham