Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
The current GrokReader implementation cannot handle complex expressions, as in the following scenario:
Suppose we have a custom Grok pattern file:
SYSLOGBASE_ISO8601 %{TIMESTAMP_ISO8601:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:
LINE_1 %{SYSLOGBASE}%{GREEDYDATA:message}
LINE_2 %{SYSLOGBASE_ISO8601}%{GREEDYDATA:message}
LINE (?:%{LINE_1}|%{LINE_2})
If we set the Grok expression to:
%{LINE}
the service will fail for two reasons:
- LINE_1 and LINE_2 define the same labels. The service tries to build a schema by adding a field for every label it encounters. This produces duplicate fields in the schema, which is not allowed.
- When the underlying Grok library reads a record using a complex expression, it returns an array as the value, because the complex expression can have multiple matches. NiFi in turn tries to handle this value as a byte array.
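The two failure modes above can be illustrated with a minimal, standalone Java sketch. It is not the NiFi fix and does not use the real Grok library: the class GrokLabelSketch and all of its methods are hypothetical. It assumes that patterns missing from the custom file (for example SYSLOGBASE) can be treated as opaque leaves, and that a multi-match value arrives as a List in which the non-matching alternatives are null.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; none of these names come from NiFi
// or from a Grok library.
final class GrokLabelSketch {

    // Matches a Grok reference of the form %{NAME} or %{NAME:label}.
    private static final Pattern REF = Pattern.compile("%\\{(\\w+)(?::(\\w+))?\\}");

    // The custom pattern file from the description, keyed by pattern name.
    static Map<String, String> exampleDefinitions() {
        Map<String, String> defs = new LinkedHashMap<>();
        defs.put("SYSLOGBASE_ISO8601",
                "%{TIMESTAMP_ISO8601:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:");
        defs.put("LINE_1", "%{SYSLOGBASE}%{GREEDYDATA:message}");
        defs.put("LINE_2", "%{SYSLOGBASE_ISO8601}%{GREEDYDATA:message}");
        defs.put("LINE", "(?:%{LINE_1}|%{LINE_2})");
        return defs;
    }

    // Collects labels in encounter order, duplicates included, by expanding
    // references to patterns found in `definitions`. Patterns not defined
    // there are treated as leaves. Assumes the definitions are not cyclic.
    static List<String> collectLabels(String expression, Map<String, String> definitions) {
        List<String> labels = new ArrayList<>();
        Matcher m = REF.matcher(expression);
        while (m.find()) {
            String label = m.group(2);
            if (label != null) {
                labels.add(label);
            }
            String nested = definitions.get(m.group(1));
            if (nested != null) {
                labels.addAll(collectLabels(nested, definitions));
            }
        }
        return labels;
    }

    // One schema field per distinct label, keeping first-seen order.
    static Set<String> uniqueLabels(List<String> labels) {
        return new LinkedHashSet<>(labels);
    }

    // Reduces a captured value to a single string. If the value is a list
    // (assumed here to be how multiple matches are reported), the first
    // non-null element is used instead of treating the list as a byte array.
    static String toFieldValue(Object captured) {
        if (captured instanceof List) {
            for (Object item : (List<?>) captured) {
                if (item != null) {
                    return item.toString();
                }
            }
            return null;
        }
        return captured == null ? null : captured.toString();
    }

    public static void main(String[] args) {
        List<String> labels = collectLabels("%{LINE}", exampleDefinitions());
        System.out.println(labels);               // [message, timestamp, logsource, message]
        System.out.println(uniqueLabels(labels)); // [message, timestamp, logsource]
    }
}
```

In this sketch the label "message" is collected twice because both LINE_1 and LINE_2 define it. Collapsing the labels into an ordered set before building the schema avoids the duplicate field. Taking the first non-null element is only one possible policy for list values; mapping the value to an array-typed field would be another.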