Apache Drill / DRILL-3178

csv reader should allow newlines inside quotes


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.9.0
    • Component/s: Storage - Text & CSV
    • Labels: None
    • Environment: Ubuntu Trusty 14.04.2 LTS

    Description

      When reading a CSV file that contains newlines within quoted strings, e.g. via

      select * from dfs.`/tmp/q.csv`;

      Drill 1.0 says:

      Error: SYSTEM ERROR: com.univocity.parsers.common.TextParsingException: Error processing input: Cannot use newline character within quoted string

      However, many tools produce CSV files with newlines inside quoted strings, so Drill should be able to read them.
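      To make the failure mode concrete, here is a minimal sketch in Python (not Drill code). The sample data and the function names `naive_rows` and `quoted_rows` are made up for illustration. It contrasts a reader that treats every newline as a record boundary with a quote-aware reader that keeps a quoted newline inside its field.

```python
import csv
import io

# Hypothetical sample: the first data record has a newline inside a quoted field.
SAMPLE = 'id,comment\n1,"first line\nsecond line"\n2,plain\n'


def naive_rows(text):
    """Split on every newline and ignore quoting (this breaks quoted fields)."""
    return [line.split(",") for line in text.splitlines()]


def quoted_rows(text):
    """Parse with the standard csv module, which honours quoted newlines."""
    # newline="" leaves line endings untranslated so csv can interpret them.
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    print("naive record count: ", len(naive_rows(SAMPLE)))
    print("quoted record count:", len(quoted_rows(SAMPLE)))
    print(quoted_rows(SAMPLE)[1])
```

      The naive reader sees one record too many, because the quoted field is cut in two at the embedded newline, while the quote-aware reader returns the header plus two records.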

      Workaround: the csvquote program (https://github.com/dbro/csvquote) can encode embedded commas and newlines, and even decode them later if desired.
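      The encode/decode idea behind this workaround can be sketched as follows. This is an illustrative Python approximation, not csvquote's actual implementation: the substitute characters `FIELD_SUB` and `RECORD_SUB` and the function names are assumptions chosen for the example. Commas and newlines inside quotes are swapped for non-printing characters so that a line-oriented reader sees one record per line, and the swap is reversed afterwards.

```python
# Assumed substitute characters (non-printing ASCII separators).
FIELD_SUB = "\x1f"   # stands in for a delimiter inside quotes
RECORD_SUB = "\x1e"  # stands in for a newline inside quotes

# Hypothetical sample with an embedded comma, newline, and doubled quotes.
SAMPLE = 'id,comment\n1,"a, b\nc"\n2,"say ""hi"""\n'


def encode(text, delimiter=",", quote='"'):
    """Replace delimiters and newlines that occur inside quoted fields."""
    out = []
    in_quotes = False
    for ch in text:
        if ch == quote:
            # A doubled quote ("") toggles twice, so the state is preserved.
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch == delimiter:
            out.append(FIELD_SUB)
        elif in_quotes and ch == "\n":
            out.append(RECORD_SUB)
        else:
            out.append(ch)
    return "".join(out)


def decode(text, delimiter=","):
    """Restore the original delimiters and newlines."""
    return text.replace(FIELD_SUB, delimiter).replace(RECORD_SUB, "\n")


if __name__ == "__main__":
    encoded = encode(SAMPLE)
    # Use split("\n") rather than splitlines(): Python's splitlines() also
    # treats "\x1e" as a line boundary and would split the encoded field.
    print(len(encoded.rstrip("\n").split("\n")), "physical lines after encoding")
    print(decode(encoded) == SAMPLE)
```

      The round trip is lossless only if the input does not already contain the substitute characters.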

      Attachments

        1. drill-3178.patch (8 kB, F Méthot)

            People

              fmethot F Méthot
              nealmcb Neal McBurnett
              Krystal Krystal
              Votes: 5
              Watchers: 8
