  1. Apache MetaModel (Retired)
  2. METAMODEL-1086

Encoding not used with InputStreams in CsvDataContext

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.5.2
    • Fix Version/s: 5.0.0
    • None

    Description

      When you use the constructor that takes an InputStream, you can run into encoding trouble if your platform's default encoding differs from the encoding of the stream's content, even though you specify an encoding in the CsvConfiguration.

      CsvDataContext csvDataContext = new CsvDataContext(someInputstream, new CsvConfiguration(1, "utf-8", ';', '"', '\\'));
      

      The offending code is in the static method createFileFromInputStream():

      private static File createFileFromInputStream(InputStream inputStream, String encoding) {
              ....
              final BufferedWriter writer = FileHelper.getBufferedWriter(file, encoding);
              final BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
              ....
      

      The InputStreamReader is instantiated without a charset, so the platform's default charset is used (e.g. "windows-1252"). The BufferedWriter, on the other hand, is instantiated with the specified charset. When the content of the stream is written to the temp directory, this effectively re-encodes the data whenever the stream's encoding (e.g. "utf-8") differs from the platform's default encoding.
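
      The effect described above can be reproduced without MetaModel. The following is a minimal sketch, not the library's code: the class name ReencodingDemo and the helper copy() are hypothetical, and ISO-8859-1 is used as a stand-in for a single-byte platform default such as "windows-1252". It decodes UTF-8 bytes with the wrong charset and writes them back out as UTF-8, the same way the reader/writer pair in createFileFromInputStream() would.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ReencodingDemo {

    // Hypothetical helper: decodes the input with readCharset and encodes it
    // again with writeCharset, like a Reader/Writer copy loop would.
    static byte[] copy(byte[] input, Charset readCharset, Charset writeCharset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(input), readCharset);
             Writer writer = new OutputStreamWriter(out, writeCharset)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // "\u00f8" is the letter o with stroke, two bytes in UTF-8.
        byte[] original = "S\u00f8rensen;1".getBytes(StandardCharsets.UTF_8);

        // Assumption: ISO-8859-1 stands in for the platform default charset.
        Charset platformDefault = StandardCharsets.ISO_8859_1;

        byte[] broken = copy(original, platformDefault, StandardCharsets.UTF_8);
        byte[] intact = copy(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println("original bytes: " + original.length); // 11
        System.out.println("broken bytes:   " + broken.length);   // 13
        System.out.println("intact equals original: " + Arrays.equals(original, intact)); // true
    }
}
```

      Each of the two UTF-8 bytes of the non-ASCII character is misread as a separate character and then written out as two bytes, so the copy grows and no longer decodes to the original text. Reading and writing with the same charset leaves the bytes unchanged.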

      Instead the code should be similar to this:

      private static File createFileFromInputStream(InputStream inputStream, String encoding) {
              ....
              final BufferedWriter writer = FileHelper.getBufferedWriter(file, encoding);
              final BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream, encoding));
              ....
      

      Alternatively, you can skip the encoding completely when copying the InputStream, because the encoding is only needed later, when the FileResource is read. A more readable implementation using Java 7 would be:

                  final File tempFile = File.createTempFile("metamodel", ".csv");
                  tempFile.deleteOnExit();
                  // Byte-for-byte copy: no charset is involved, so nothing is re-encoded.
                  Files.copy(inputStream, tempFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
                  return tempFile;
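
      For illustration, the byte-level copy can be wrapped into a complete method and checked in isolation. This is a sketch under assumptions, not the fix that was committed: the class name TempFileCopyDemo and the helper roundTrip() are hypothetical, and IOExceptions are wrapped in UncheckedIOException only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

class TempFileCopyDemo {

    // Copies the stream to a temp file without decoding it, so the bytes on
    // disk are identical to the bytes in the stream.
    static File createFileFromInputStream(InputStream inputStream) {
        try {
            File tempFile = File.createTempFile("metamodel", ".csv");
            tempFile.deleteOnExit();
            // createTempFile already created the file, hence REPLACE_EXISTING.
            Files.copy(inputStream, tempFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
            return tempFile;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes the bytes through the copy and reads them back.
    static byte[] roundTrip(byte[] content) {
        File file = createFileFromInputStream(new ByteArrayInputStream(content));
        try {
            return Files.readAllBytes(file.toPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = "S\u00f8rensen;1".getBytes(StandardCharsets.UTF_8);
        byte[] copied = roundTrip(original);
        System.out.println("bytes preserved: " + Arrays.equals(original, copied));
    }
}
```

      Because the copy never turns bytes into characters, the platform's default charset plays no role here; the configured encoding only matters when the temp file is read later.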
      

            People

              Assignee: Kasper Sørensen (kaspersor)
              Reporter: Samuel Mumm (samuelmumm)