[NIFI-4822] ValidateRecord does not maintain order of CSV records - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 1.4.0, 1.5.0
Fix Version/s: None
Component/s: None
Labels: None

Description

If ValidateRecord is configured with a CSV reader and a CSV writer and receives valid data, the flow file is routed to "valid", but the columns are written out in a different order than they were read.

This means that if the next processor is another record-oriented processor using the exact same schema and reader, it will fail to read the flow file because the first column won't be what it expects.

From doing some digging, it appears that in WriteCsvResult there is a method getFieldNames() that does this:

final Set<String> allFields = new LinkedHashSet<>();
allFields.addAll(record.getRawFieldNames());
allFields.addAll(recordSchema.getFieldNames());

In this case, record.getRawFieldNames() comes from the key set of a HashMap, which means the order in which the fields were read is not maintained.
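
To illustrate why the raw field names dominate the output order, here is a minimal standalone sketch. It does not use any NiFi classes: FieldOrderDemo and mergeFieldNames are hypothetical names, and the method only mimics the union quoted above. Because a LinkedHashSet keeps first-insertion order, whatever order the raw names arrive in wins, and the schema order only applies to names not already present.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class; not part of NiFi.
class FieldOrderDemo {

    // Mimics the union from the description: raw names first, then schema names.
    // A LinkedHashSet keeps the position of the FIRST insertion of each element.
    static List<String> mergeFieldNames(Set<String> rawFieldNames, List<String> schemaFieldNames) {
        final Set<String> allFields = new LinkedHashSet<>();
        allFields.addAll(rawFieldNames);
        allFields.addAll(schemaFieldNames);
        return new ArrayList<>(allFields);
    }

    public static void main(String[] args) {
        List<String> schemaOrder = Arrays.asList("id", "first_name", "last_name", "email");

        // Values stored in a plain HashMap, as the reader is reported to do.
        Map<String, Object> values = new HashMap<>();
        for (String field : schemaOrder) {
            values.put(field, "x");
        }

        // The key set iterates in hash order, which need not match schemaOrder,
        // so the merged list may come out in a different order than the schema.
        System.out.println("schema order: " + schemaOrder);
        System.out.println("merged order: " + mergeFieldNames(values.keySet(), schemaOrder));
    }
}
```

The schema field names are added second, so they cannot restore the original column order once the raw names have already been inserted.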

CsvRecordReader line 97:

final Map<String, Object> values = new HashMap<>(recordFields.size() * 2);
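
One possible direction, shown only as a sketch and not necessarily the change made under NIFI-4955, would be to hold the values in an insertion-ordered map so that the key set reflects the order in which the columns were read. OrderedValuesDemo and readFieldOrder are hypothetical names used for illustration; the real reader does considerably more than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; not part of NiFi.
class OrderedValuesDemo {

    // Stores one value per header column and returns the key order afterwards.
    // LinkedHashMap iterates in insertion order, unlike HashMap.
    static List<String> readFieldOrder(List<String> header) {
        final Map<String, Object> values = new LinkedHashMap<>(header.size() * 2);
        for (String name : header) {
            values.put(name, "value-of-" + name); // placeholder value
        }
        return new ArrayList<>(values.keySet());
    }

    public static void main(String[] args) {
        List<String> header = Arrays.asList("zeta", "alpha", "mid");
        // The key order matches the header order because insertion order is kept.
        System.out.println(readFieldOrder(header));
    }
}
```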

Issue Links

is part of

NIFI-4955 ValidateRecord does not preserve columns ordering with CSV

Resolved

People

Assignee: Unassigned

Reporter: Bryan Bende

Votes: 0

Watchers: 1

Dates

Created: 26/Jan/18 20:56

Updated: 13/Jun/18 06:22

Resolved: 13/Jun/18 06:22