Commons CSV / CSV-53

Allow to always enclose printed values into quotes

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0
    • Component/s: Printer
    • Labels:
      None

      Description

      The printer encloses a value in quotes only if it contains a special character such as a field separator or a line separator. However, some applications expect values to always be enclosed in quotes. It's necessary to be able to control how the quotes are added, either always or as needed.

        Activity

        Gary Gregory added a comment -

        Resolving. See the Quote enum.

        Gary Gregory added a comment -

        Are we happy with the current impl?
        See org.apache.commons.csv.Quote.

        Sebb added a comment -

        If the user selects QUOTE_NONE then the only options for printing a value which contains a delimiter or EOL - or which starts with the quote character - are:

        • escape the character(s)
        • throw an Exception

        In which case, we just use the escape character setting to determine which to do.

        [Otherwise the printed output will not be valid, and I don't think we should allow that.]

        If the user wants to use a different escape setting for parsing and printing, then they can just create a different format.
        Otherwise we are back into having separate parsing and printing CSVFormat classes, which you already said you did not want.
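
For illustration only: Python's csv module, which is cited elsewhere in this thread, already resolves QUOTE_NONE in the way described above, letting the escape character setting decide between escaping and raising. The sketch below is not Commons CSV code; `print_unquoted` is a hypothetical helper name.

```python
import csv
import io


def print_unquoted(row, escapechar=None):
    """Write one row with quoting disabled and return the text produced."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        quoting=csv.QUOTE_NONE,
        escapechar=escapechar,
        lineterminator="\n",
    )
    writer.writerow(row)
    return buf.getvalue()


# Escape character set: the delimiter inside the value is escaped.
escaped = print_unquoted(["a,b", "c"], escapechar="\\")
print(escaped, end="")  # a\,b,c

# No escape character set: the writer raises instead of emitting invalid output.
try:
    print_unquoted(["a,b", "c"])
    raised = False
except csv.Error:
    raised = True
print(raised)  # True
```

Values that contain no special characters print the same way with or without an escape character, so the setting only matters when the output would otherwise be ambiguous.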

        Emmanuel Bourg added a comment -

        I agree that the parser should not convert numeric fields to floats; my comment was referring to the printer.

        The escaping might be specified by another enum to select between quote doubling, quote escaping and no escaping.
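
As a point of comparison, and not as a proposal for the Commons CSV API, Python's csv module exposes the same three choices through its `doublequote` and `escapechar` parameters rather than an enum. A minimal sketch, where `print_row` is a hypothetical helper name:

```python
import csv
import io


def print_row(row, **options):
    """Write one row with the given csv.writer options and return the text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n", **options).writerow(row)
    return buf.getvalue()


row = ['say "hi"']

# Quote doubling: an embedded quote is written twice inside a quoted field.
doubled = print_row(row, doublequote=True)
print(doubled, end="")  # "say ""hi"""

# Quote escaping: an embedded quote is preceded by the escape character.
escaped = print_row(row, doublequote=False, escapechar="\\")
print(escaped, end="")

# Reading the escaped form back with the same settings recovers the value.
reader = csv.reader(io.StringIO(escaped), doublequote=False, escapechar="\\")
round_trip = next(reader)

# No escaping: with doubling off and no escape character, the writer raises.
try:
    print_row(row, doublequote=False)
    raised = False
except csv.Error:
    raised = True
```

Keeping these options apart from the quoting mode means any quoting policy can be combined with any escaping policy.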

        Sebb added a comment -

        +1 to including an OutputQuoting setting along those lines, e.g. an Enum.
        Though I don't think we should convert numeric fields to type float!

        QUOTE_NONE cannot be allowed to generate invalid output, so must either escape or throw an error.
        So I don't understand what you mean by "a separate parameter".

        Emmanuel Bourg added a comment -

        The Python csv module has a similar concept. It offers 4 modes to control the quoting behavior:

        http://docs.python.org/library/csv.html

        • csv.QUOTE_ALL
          Instructs writer objects to quote all fields.
        • csv.QUOTE_MINIMAL
          Instructs writer objects to only quote those fields which contain special characters such as delimiter, quotechar or any of the characters in lineterminator.
        • csv.QUOTE_NONNUMERIC
          Instructs writer objects to quote all non-numeric fields.
          Instructs the reader to convert all non-quoted fields to type float.
        • csv.QUOTE_NONE
          Instructs writer objects to never quote fields. When the current delimiter occurs in output data it is preceded by the current escapechar character. If escapechar is not set, the writer will raise Error if any characters that require escaping are encountered.
          Instructs reader to perform no special processing of quote characters.

        The idea of quoting only non-numeric values is interesting; I wonder whether this helps Excel select the right data type. However, I wouldn't mix the quoting behavior with the escaping behavior the way QUOTE_NONE does; escaping is probably best handled by a separate parameter.
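
The four modes listed above can be compared side by side with a short Python sketch. It uses only the standard csv module; `render` and the sample row are hypothetical names chosen for illustration.

```python
import csv
import io


def render(row, quoting, **options):
    """Write one row with the given quoting mode and return the text."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\n", **options)
    writer.writerow(row)
    return buf.getvalue()


# One plain string, one number, one string containing the delimiter.
row = ["abc", 42, "x,y"]

print(render(row, csv.QUOTE_ALL), end="")         # "abc","42","x,y"
print(render(row, csv.QUOTE_MINIMAL), end="")     # abc,42,"x,y"
print(render(row, csv.QUOTE_NONNUMERIC), end="")  # "abc",42,"x,y"

# QUOTE_NONE needs an escape character here because "x,y" holds a delimiter.
print(render(row, csv.QUOTE_NONE, escapechar="\\"), end="")  # abc,42,x\,y
```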


          People

          • Assignee:
            Unassigned
            Reporter:
            Emmanuel Bourg