  1. Commons CSV
  2. CSV-227

First column is always quoted when it contains multilingual (non-ASCII) characters, but other columns are not

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5
    • Fix Version/s: None
    • Component/s: Parser
    • Labels: None

    Description

      When a record contains multilingual characters (UTF-8 encoding),

      CSVPrinter always quotes the first column, but never the other columns.

       

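      // Example code
      import java.util.ArrayList;
      import java.util.List;
      import org.apache.commons.csv.CSVFormat;
      import org.apache.commons.csv.CSVPrinter;
      import org.apache.commons.csv.QuoteMode;

      CSVFormat format = CSVFormat.DEFAULT.withQuoteMode(QuoteMode.MINIMAL);
      CSVPrinter printer = new CSVPrinter(System.out, format);

      List<String[]> temp = new ArrayList<>();
      temp.add(new String[] { "ㅁㅎㄷㄹ", "ㅁㅎㄷㄹ", "", "test2" });
      temp.add(new String[] { "한글3", "hello3", "3한글3", "test3" });
      temp.add(new String[] { "", "hello4", "", "test4" });

      for (String[] temp1 : temp) {
          // Cast so each array element is printed as one column (avoids the varargs warning)
          printer.printRecord((Object[]) temp1);
      }
      printer.close();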

       

      Result:

      "ㅁㅎㄷㄹ",ㅁㅎㄷㄹ,,test2
      "한글3",hello3,3한글3,test3
      "",hello4,,test4

       

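      I found the code responsible.

      Multilingual characters lie above 0x7E, outside the RFC 4180 TEXTDATA range. When newRecord is true, which is the case for the first column of a record, a value starting with such a character is always quoted.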

        

      // CSVFormat.class, lines 1173-1178
      ...
      char c = value.charAt(pos);

      // RFC4180 (https://tools.ietf.org/html/rfc4180) TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
      if (newRecord && (c < 0x20 || c > 0x21 && c < 0x23 || c > 0x2B && c < 0x2D || c > 0x7E)) {
          quote = true;
      } else if (c <= COMMENT) {
      ...
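
To make the effect of that condition easier to see, here is a minimal standalone sketch that copies only the quoted boolean expression into a helper and evaluates it for the first-column values from the example. The class name `FirstCharQuoteCheck` and the method `forcesQuote` are hypothetical and are not part of Commons CSV. The sketch also assumes that `pos` points at the first character of the value when the check runs, which the excerpt does not show.

```java
// Sketch only: mirrors the condition quoted from CSVFormat above; this is not library code.
final class FirstCharQuoteCheck {

    /** Returns true when the quoted condition would set quote = true for this value. */
    static boolean forcesQuote(String value, boolean newRecord) {
        if (value.isEmpty()) {
            // Empty values are handled by code outside the quoted excerpt; out of scope here.
            return false;
        }
        char c = value.charAt(0); // assumption: pos refers to the first character
        return newRecord
                && (c < 0x20 || c > 0x21 && c < 0x23 || c > 0x2B && c < 0x2D || c > 0x7E);
    }

    public static void main(String[] args) {
        String[] samples = { "ㅁㅎㄷㄹ", "한글3", "hello4" };
        for (String s : samples) {
            // newRecord = true models the first column, false models any later column
            System.out.println(s + " firstColumn=" + forcesQuote(s, true)
                    + " laterColumn=" + forcesQuote(s, false));
        }
    }
}
```

Under these assumptions, the two values that start with a Korean character evaluate to true only when `newRecord` is true, while "hello4" evaluates to false in both positions. This matches the reported output, where the same text is quoted in the first column and left unquoted in the second.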

       

      Could you fix this bug?

       

          People

            Assignee: Unassigned
            Reporter: Jisun, Shin (Trichotomy)