[NIFI-5324] Implement syslog record readers - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Implemented
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

Creating this Jira based on discussion with ottobackwards in the NiFi HipChat room...

We currently have ListenSyslog with optional parsing when batch size is 1, and ParseSyslog which also assumes 1 message per flow file. There is also ListenTCPRecord and ListenUDPRecord which can be used with a GrokReader to read log messages from the respective network connections.

The common scenario for wanting to parse the syslog messages is to extract a field from the syslog message into an attribute and then use the attribute to make decisions like routing/filtering.

Since the "1 message per flow file" pattern is generally something we try to avoid, it would be nice if we could keep batches of syslog messages together in a single flow file and then use record processors to process the batches.

For example, if we had a syslog record reader, we could use PartitionRecord to divide a flow file of many syslog records into smaller groups based on some field in the message. Each group could then be routed somewhere based on the group value.
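To make the partitioning idea concrete, here is a minimal Python sketch of grouping a batch of already-parsed syslog records by one field. It is only an illustration of the concept, not NiFi code: the function name `partition_records` and the field names (`hostname`, `severity`, `body`) are assumptions made up for the example.

```python
from collections import defaultdict


def partition_records(records, field):
    """Group records by the value of one field, keeping input order within each group."""
    groups = defaultdict(list)
    for record in records:
        # Records that lack the field end up together under the key None.
        groups[record.get(field)].append(record)
    return dict(groups)


# Hypothetical batch: one flow file holding several parsed syslog records.
batch = [
    {"hostname": "web01", "severity": 3, "body": "disk failure"},
    {"hostname": "db01", "severity": 6, "body": "backup complete"},
    {"hostname": "web01", "severity": 6, "body": "request served"},
]

groups = partition_records(batch, "hostname")
for value, members in groups.items():
    print(value, len(members))
```

Each resulting group corresponds to what would become a separate outgoing flow file, with the group value available for routing.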

Another example would be to use QueryRecord to run a SQL query that selects specific syslog messages based on a field in the message.
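As a rough illustration of the kind of query this would allow, the sketch below loads a few parsed records into an in-memory SQLite table and filters them with SQL. QueryRecord exposes the flow file content as a table named FLOWFILE, so the table is named the same way here, but SQLite is only a stand-in for QueryRecord's own SQL engine, and the column names (`hostname`, `severity`, `body`) are assumptions for the example.

```python
import sqlite3


def query_records(records, sql):
    """Run a SQL query over a list of record dicts using an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    # Hypothetical columns; a real reader would dictate the available fields.
    conn.execute("CREATE TABLE FLOWFILE (hostname TEXT, severity INTEGER, body TEXT)")
    conn.executemany(
        "INSERT INTO FLOWFILE VALUES (:hostname, :severity, :body)", records
    )
    rows = [dict(row) for row in conn.execute(sql)]
    conn.close()
    return rows


batch = [
    {"hostname": "web01", "severity": 3, "body": "disk failure"},
    {"hostname": "db01", "severity": 6, "body": "backup complete"},
]

# Lower severity numbers are more urgent in syslog, so "<= 3" keeps errors and worse.
errors = query_records(batch, "SELECT hostname, body FROM FLOWFILE WHERE severity <= 3")
print(errors)
```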

It would also make it easy to convert syslog messages to a structured format using ConvertRecord with a syslog reader and a writer like JSON or Avro.
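A small Python sketch of that conversion, assuming legacy BSD-style lines of the form `<PRI>Mmm dd hh:mm:ss host message`. The regular expression is deliberately simplified and is not the parser NiFi uses; the output field names are assumptions. The facility and severity are derived from the priority value as facility = PRI / 8 and severity = PRI mod 8.

```python
import json
import re

# Simplified pattern for a legacy syslog line; real-world messages vary more than this.
LINE = re.compile(
    r"^<(\d{1,3})>([A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) (\S+) (.*)$"
)


def parse_line(line):
    """Parse one syslog line into a dict, or raise ValueError if it does not match."""
    match = LINE.match(line)
    if match is None:
        raise ValueError(f"not a recognized syslog line: {line!r}")
    priority = int(match.group(1))
    return {
        "priority": priority,
        "facility": priority // 8,
        "severity": priority % 8,
        "timestamp": match.group(2),
        "hostname": match.group(3),
        "body": match.group(4),
    }


def to_json_lines(text):
    """Convert a batch of syslog lines into one JSON object per line."""
    return "\n".join(
        json.dumps(parse_line(line)) for line in text.splitlines() if line.strip()
    )


sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
print(to_json_lines(sample))
```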

We would likely want two syslog record readers, one for each of the RFC formats (RFC 5424 and the legacy RFC 3164).

One aspect to consider is related to the schema used/produced by the reader... typically the readers/writers have a "Schema Access Strategy" where they can obtain a schema from a schema registry, or from flow file attributes, or something specific to the format like an embedded Avro schema.

In this case, the schema is somewhat pre-determined by the specific syslog reader, because the schema can contain at most the fields produced by the reader when parsing the messages. So this may be a case where there is no schema access strategy and the schemas are pre-determined. It is sort of like the GrokReader, which creates a schema from the named fields in the expression, except that in this case there is no user-defined expression and the named fields are dictated by the parser.
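One way to picture a reader with a pre-determined schema is sketched below in Python. The field list `SYSLOG_FIELDS` and the helper `conform` are hypothetical names chosen for the example, not the schema the readers would actually ship with. The point is only that the reader owns the field list: fields the parser did not produce come out as null, and nothing outside the list is accepted.

```python
# Hypothetical fixed field list; the actual fields would be dictated by the parser.
SYSLOG_FIELDS = ("priority", "severity", "facility", "timestamp", "hostname", "body")


def conform(parsed):
    """Shape a parsed message to the fixed schema.

    Missing fields become None; fields outside the schema are rejected,
    since the schema can hold at most what the parser produces.
    """
    unknown = set(parsed) - set(SYSLOG_FIELDS)
    if unknown:
        raise ValueError(f"fields not in the fixed schema: {sorted(unknown)}")
    return {name: parsed.get(name) for name in SYSLOG_FIELDS}


record = conform({"hostname": "web01", "body": "request served"})
print(record)
```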

We may need to reuse syslog related code that is in nifi-standard-processors, so it might require moving that code to nifi-processor-utils, or creating a new nifi-syslog-utils module.

Issue Links

is related to

NIFI-5325 Need a Syslog Parser that fully supports the 5424 Spec (Resolved)

NIFI-5139 ListenSyslog should process Structured Data (Resolved)

Sub-Tasks

1. Add Syslog 5424 Record Reader and create nifi-syslog-utils (Status: Resolved; Assignee: Otto Fowler)
2. Add Syslog Record Reader legacy Syslog (Status: Resolved; Assignee: Otto Fowler)

People

Assignee: Otto Fowler

Reporter: Bryan Bende

Votes: 0

Watchers: 2

Dates

Created: 20/Jun/18 14:31

Updated: 17/Jul/18 19:19

Resolved: 17/Jul/18 19:19