Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 2.8.0
- None
Description
As defined by the CharsetEncoder documentation, an encoding operation consists of the following steps:
- (reset)
- encode
- flush
However, org.apache.commons.io.input.ReaderInputStream does not call flush. This leads to incorrect results for charsets whose flush method appends additional bytes.
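For reference, the three-step sequence described above can be sketched roughly as follows. This is a minimal illustration, not the ReaderInputStream code: the class name `EncodeSketch` and the method `encodeFully` are hypothetical, and the output buffer is simply sized generously so that OVERFLOW does not need to be handled here.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

// Hypothetical helper, for illustration only.
public class EncodeSketch {

    static byte[] encodeFully(String s, Charset charset) throws CharacterCodingException {
        CharsetEncoder encoder = charset.newEncoder();
        CharBuffer in = CharBuffer.wrap(s);
        // Assumption: this buffer is large enough for the whole output,
        // including any bytes appended by flush.
        ByteBuffer out = ByteBuffer.allocate((int) (s.length() * encoder.maxBytesPerChar()) + 16);

        // Step 1: reset (optional for a freshly created encoder).
        encoder.reset();

        // Step 2: encode, signalling that no further input will follow.
        CoderResult result = encoder.encode(in, out, true);
        if (!result.isUnderflow()) {
            result.throwException();
        }

        // Step 3: flush, so the encoder can write any trailing bytes.
        result = encoder.flush(out);
        if (!result.isUnderflow()) {
            result.throwException();
        }

        // Copy the encoded bytes out of the buffer.
        out.flip();
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }
}
```

For charsets whose encoder keeps no trailing state (UTF-8, for example) step 3 writes nothing, which is presumably why the missing call goes unnoticed in most cases.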
Example:
// Charset whose CharsetEncoder.flush(...) puts bytes
Charset charset = Charset.forName("Cp930");
// \u0391: Causes CharsetEncoder.flush(...) to put additional bytes
String s = "\u0391";
byte[] expected = s.getBytes(charset);
byte[] actual;
try (InputStream in = new ReaderInputStream(new StringReader(s), charset)) {
    actual = IOUtils.toByteArray(in);
}
if (!Arrays.equals(expected, actual)) {
    throw new AssertionError("\n Expected: " + Arrays.toString(expected) + "\n Actual: " + Arrays.toString(actual));
}
Also make sure to check the result of flush(), because it can return OVERFLOW when the output buffer has too little room for the remaining bytes. In theory a result with isError() == true might be possible as well, but I don't think any of the charset implementations currently return that.
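One way to handle the flush() result is to drain the output buffer and call flush() again until it reports UNDERFLOW. The sketch below shows this idea only; it is not the fix applied to ReaderInputStream. `FlushLoopSketch`, `encodeWithSmallBuffer`, and `drain` are hypothetical names, and the sketch assumes that a buffer of at least `maxBytesPerChar()` bytes is enough for the encoder to make progress.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

// Hypothetical helper, for illustration only.
public class FlushLoopSketch {

    static byte[] encodeWithSmallBuffer(String s, Charset charset, int bufferSize)
            throws CharacterCodingException {
        CharsetEncoder encoder = charset.newEncoder();
        // Assumption: a buffer this large always lets the encoder make progress.
        if (bufferSize < Math.ceil(encoder.maxBytesPerChar())) {
            throw new IllegalArgumentException("bufferSize too small for " + charset);
        }
        CharBuffer in = CharBuffer.wrap(s);
        ByteBuffer out = ByteBuffer.allocate(bufferSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        // Encode: on OVERFLOW, drain the buffer and continue.
        while (true) {
            CoderResult result = encoder.encode(in, out, true);
            drain(out, sink);
            if (result.isUnderflow()) {
                break;
            }
            if (result.isError()) {
                result.throwException();
            }
        }

        // Flush: the result is checked in the same way, since OVERFLOW is possible here too.
        while (true) {
            CoderResult result = encoder.flush(out);
            drain(out, sink);
            if (result.isUnderflow()) {
                break;
            }
            if (result.isError()) {
                result.throwException();
            }
        }
        return sink.toByteArray();
    }

    // Moves everything written to the buffer so far into the sink and empties the buffer.
    private static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }
}
```

Treating isError() the same way for encode and flush keeps the two loops symmetric, even if no current charset implementation is known to report an error from flush.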
Attachments
Issue Links
- is related to IO-780 ReaderInputStream discards some encoding errors (Open)