Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 1.4
- None
- None
Description
When I try to read and parse the file faulty.csv, I get the following error:
java.io.IOException: (line 1) invalid char between encapsulated token and delimiter
at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:500)
at org.apache.commons.csv.CSVParser.initializeHeader(CSVParser.java:389)
at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:284)
at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:252)
at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:846)
The failing code is the parsing step that returns the iterator:
csvFormat = CSVFormat.DEFAULT.withHeader().withDelimiter(';').withIgnoreHeaderCase();
iterator = csvFormat.parse(reader).iterator();
The invalid characters are the non-printable SOH (U+0001) and STX (U+0002) control characters at the end of the line.
I debugged through the Commons CSV source and hit the exception in the Lexer, which does not handle these special characters.
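To confirm which characters are involved, the offending line can be scanned for control characters before it is handed to the parser. The following is a minimal sketch using only the JDK; the class name `ControlCharScan` and the method `findControlChars` are hypothetical helpers, not part of Commons CSV. It assumes that tab, CR and LF are legitimate in the file and reports every other ISO control character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, not part of Commons CSV.
final class ControlCharScan {

    /** Returns the indexes of ISO control characters in the line, ignoring tab, CR and LF. */
    static List<Integer> findControlChars(String line) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            // Character.isISOControl covers U+0000..U+001F and U+007F..U+009F,
            // which includes SOH (U+0001) and STX (U+0002).
            if (Character.isISOControl(c) && c != '\t' && c != '\r' && c != '\n') {
                positions.add(i);
            }
        }
        return positions;
    }
}
```

For a line such as `a;"b"` followed by SOH and STX, the sketch reports the two trailing positions, directly after the closing quote. That is consistent with the "between encapsulated token and delimiter" wording of the exception.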
Unfortunately, I cannot offer any hints on fixing this, as I am not familiar with this type of character or with how it should be handled.
Sincerely
Attachments
Issue Links
- is related to
  - IO-577 Add readers to filter out given characters: CharacterSetFilterReader and CharacterFilterReader. (Resolved)
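
A possible workaround in the spirit of the linked IO-577 is to strip the offending characters from the input before the parser sees it. The sketch below is an assumption about how such a filter could look using only the JDK. The class `ControlCharFilterReader` and its `filter` helper are hypothetical names, and this is not the Commons IO implementation. It drops only SOH and STX, on the assumption that these characters carry no meaning in the data.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical workaround: a Reader that silently drops SOH (U+0001) and STX (U+0002).
final class ControlCharFilterReader extends FilterReader {

    ControlCharFilterReader(Reader in) {
        super(in);
    }

    private static boolean isFiltered(int c) {
        return c == 0x01 || c == 0x02;
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();
        } while (c != -1 && isFiltered(c));
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int kept;
        do {
            int n = super.read(buf, off, len);
            if (n == -1) {
                return -1;
            }
            // Compact the kept characters to the front of the requested range.
            kept = 0;
            for (int i = off; i < off + n; i++) {
                if (!isFiltered(buf[i])) {
                    buf[off + kept++] = buf[i];
                }
            }
        } while (kept == 0); // everything was filtered out: read again
        return kept;
    }

    /** Convenience for trying the filter on a string. */
    static String filter(String input) {
        StringBuilder out = new StringBuilder();
        try (Reader r = new ControlCharFilterReader(new StringReader(input))) {
            char[] buf = new char[64];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

With the code from the description, the reader would be wrapped before parsing, for example `csvFormat.parse(new ControlCharFilterReader(reader))`. Whether silently dropping these characters is acceptable depends on the data, so this is a workaround rather than a fix for the Lexer behaviour.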