Commons CSV / CSV-226

Add CSVParser test case for standard charsets


Details

    • Type: Test
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.5
    • Fix Version/s: None
    • Component/s: Parser
    • Labels: None

    Description

      Hello, I'd like to contribute a CSVParser test suite covering the standard charsets defined in java.nio.charset.StandardCharsets, plus UTF-32.

      This is a standalone test, but it also supports a fix for CSV-107. In addition, it refactors and unifies the testing around your established workaround of inserting a BOMInputStream ahead of the CSVParser.

      The test takes a single UTF-8 encoded base file (cstest.csv) and copies it to multiple output files (in the target directory), each in a different character set, much as the iconv tool would. Each file is then fed into the parser to cover all the BOM and no-BOM Unicode variants. I think a file-based approach is still important here, rather than encoding a character stream inline as a string: if issues develop, the data is easy to inspect.
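
      As a rough illustration of the transcoding step only, here is a minimal JDK-only sketch; it is not the attached test. The class name CharsetVariants, its methods, the output file names, and the inline sample data are all hypothetical. The sketch also assumes the JDK provides the UTF-32BE and UTF-32LE charsets, which StandardCharsets does not define. Feeding the files to CSVParser is left out to keep the example dependency-free.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Commons CSV or the attached patch. */
final class CharsetVariants {

    /** Unicode charsets mapped to their byte order marks. UTF-32 is looked up by name (assumed available). */
    static Map<Charset, byte[]> bomTable() {
        Map<Charset, byte[]> boms = new LinkedHashMap<>();
        boms.put(StandardCharsets.UTF_8, new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF});
        boms.put(StandardCharsets.UTF_16BE, new byte[] {(byte) 0xFE, (byte) 0xFF});
        boms.put(StandardCharsets.UTF_16LE, new byte[] {(byte) 0xFF, (byte) 0xFE});
        boms.put(Charset.forName("UTF-32BE"), new byte[] {0, 0, (byte) 0xFE, (byte) 0xFF});
        boms.put(Charset.forName("UTF-32LE"), new byte[] {(byte) 0xFF, (byte) 0xFE, 0, 0});
        return boms;
    }

    /** Encodes text in the given charset (which must be in bomTable), optionally prefixing its BOM. */
    static byte[] encode(String text, Charset charset, boolean withBom) {
        byte[] body = text.getBytes(charset); // these endian-specific charsets emit no BOM themselves
        if (!withBom) {
            return body;
        }
        byte[] bom = bomTable().get(charset);
        byte[] out = Arrays.copyOf(bom, bom.length + body.length);
        System.arraycopy(body, 0, out, bom.length, body.length);
        return out;
    }

    /** Decodes bytes in the given charset (which must be in bomTable), skipping a leading BOM if present. */
    static String decode(byte[] data, Charset charset) {
        byte[] bom = bomTable().get(charset);
        boolean hasBom = data.length >= bom.length
                && Arrays.equals(Arrays.copyOf(data, bom.length), bom);
        int offset = hasBom ? bom.length : 0;
        return new String(data, offset, data.length - offset, charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the contents of the base file; the real test reads cstest.csv.
        String base = "id,name\n1,Zo\u00eb\n";
        Path dir = Files.createTempDirectory("cstest");
        for (Charset charset : bomTable().keySet()) {
            for (boolean withBom : new boolean[] {true, false}) {
                String suffix = withBom ? "-BOM" : "-NOBOM";
                Path file = dir.resolve("cstest-" + charset.name() + suffix + ".csv");
                Files.write(file, encode(base, charset, withBom));
                String roundTrip = decode(Files.readAllBytes(file), charset);
                System.out.println(file.getFileName() + " round-trips: " + base.equals(roundTrip));
            }
        }
    }
}
```

      Because every variant is written to disk, a failing case can be inspected directly, for example with a hex dump of the file in question.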

      I noticed that the project’s pom.xml (rat config) excludes individual test resource files by name rather than using a wildcard expression to exclude every file in the directory. Is there a reason for this? It would be much better if developers did not have to maintain this configuration by hand.

      For example, switch to a single exclude expression:
      <exclude>src/test/resources/**/*</exclude>
      

      Attachments

        Activity


          People

            Assignee: Unassigned
            Reporter: Anson Schwabecher (aeschwabe)

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 40m
