Description
StandardRecordWriter.writeRecord()[1] uses DataOutputStream.writeUTF()[2] without checking the length of the value to be written. If the value's encoded length (in modified UTF-8) is greater than 65535 bytes (2^16 - 1), you get a UTFDataFormatException "encoded string too long..."[3]. Ultimately, this can result in an IllegalStateException[4], bringing the data flow to a halt and producing PersistentProvenanceRepository "Unable to merge <prov_journal> with other Journal Files due to..." WARNings.
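The limit can be reproduced outside NiFi with a few lines of plain Java. The sketch below is illustrative only: the class and helper names (WriteUtfLimitDemo, canWriteUtf, repeat) are made up for this example and do not appear in the NiFi code base. It also shows that the limit applies to the encoded byte count, not the character count, so a string of fewer than 65535 non-ASCII characters can still fail.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WriteUtfLimitDemo {

    // Hypothetical helper: returns true if writeUTF() accepts the value,
    // false if it rejects it with UTFDataFormatException.
    static boolean canWriteUtf(String value) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(value);
            return true;
        } catch (UTFDataFormatException e) {
            return false;
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: builds a string of 'count' copies of 'c'.
    static String repeat(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        // 65535 ASCII characters encode to 65535 bytes: accepted.
        System.out.println(canWriteUtf(repeat('a', 65535)));      // true
        // One more byte pushes the value over the limit.
        System.out.println(canWriteUtf(repeat('a', 65536)));      // false
        // 40000 two-byte characters encode to 80000 bytes: rejected.
        System.out.println(canWriteUtf(repeat('\u00e9', 40000))); // false
    }
}
```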
Several of the field values written in this way are pre-defined, and thus unlikely to be an issue. However, the "details" field can be populated by a processor, and can be of arbitrary length. Additionally, if the details field is indexed (which I didn't investigate, but I'm sure is easy enough to determine), then its length might also be subject to the Lucene limit discussed in NIFI-2787.
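One possible guard, sketched here only to make the problem concrete, would be to truncate an over-long value such as "details" before it reaches writeUTF(). This is an assumption for illustration, not the fix applied in NiFi; the names UtfTruncation, encodedLength and truncateForWriteUtf are hypothetical. The per-character byte counts follow the modified UTF-8 rules documented for writeUTF()[2].

```java
import java.util.Arrays;

public class UtfTruncation {

    // Largest encoded length, in bytes, that writeUTF() accepts.
    static final int MAX_UTF_BYTES = 65535;

    // Number of bytes writeUTF() uses for a single char (modified UTF-8).
    static int encodedLength(char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            return 1;
        }
        if (c > 0x07FF) {
            return 3;
        }
        return 2; // U+0000 and U+0080..U+07FF
    }

    // Returns the longest prefix of 'value' whose encoded length fits the limit.
    static String truncateForWriteUtf(String value) {
        int bytes = 0;
        for (int i = 0; i < value.length(); i++) {
            bytes += encodedLength(value.charAt(i));
            if (bytes > MAX_UTF_BYTES) {
                int end = i;
                // Avoid cutting a surrogate pair in half.
                if (end > 0 && Character.isHighSurrogate(value.charAt(end - 1))) {
                    end--;
                }
                return value.substring(0, end);
            }
        }
        return value;
    }

    // Helper for the demo: builds a string of 'count' copies of 'c'.
    static String repeat(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(truncateForWriteUtf(repeat('a', 70000)).length());      // 65535
        System.out.println(truncateForWriteUtf(repeat('\u00e9', 40000)).length()); // 32767
    }
}
```

Truncation silently loses data, so it is only one option. Writing the encoded bytes behind a 4-byte length prefix would avoid the limit altogether, at the cost of changing the on-disk record format.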
[1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/StandardRecordWriter.java#L163-L173
[2] http://docs.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html#writeUTF%28java.lang.String%29
[3] http://stackoverflow.com/questions/22741556/dataoutputstream-purpose-of-the-encoded-string-too-long-restriction
[4] https://github.com/apache/nifi/blob/5fd4a55791da27fdba577636ac985a294618328a/nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/PersistentProvenanceRepository.java#L754-L755
Issue Links
- relates to NIFI-3389 FlowFileSchema writes attribute name and value as STRING instead of LONG_STRING (Resolved)