  1. Tajo
  2. TAJO-1339

Incorrect handling of tables with custom delimiter when their data contain '|'


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      With the table data

      1;a;1.1
      2;a|b;2.4
      3;b|c|d;3.2
      

      and external table declaration

      create external table delimiter (id int, name text, score float) using csv
      with ('csvfile.delimiter'=';') location 'xxx';
      

      the query 'select name, score from delimiter' returns the following incorrect result:

      name,score
      -------------------------------
      a,1.1
      a,null
      b,null
      

      It looks like the '|' in the name column is being treated as a delimiter.
      From inspecting the code, table meta information such as 'csvfile.delimiter' is honored only by the leaf scan operation. All other operations (including writing intermediate data and materializing the query result) assume that the delimiter is DEFAULT_FIELD_DELIMITER, which is '|'.
      Hence, if the plan includes a step that writes intermediate data,
      any '|' in the data is treated as a delimiter when that data is read back, even though it is not one.
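
      The suspected mechanism can be reproduced outside Tajo with a minimal Python sketch. This is not Tajo's code: the function names (leaf_scan, write_intermediate, read_intermediate, select_name_score) are hypothetical, and the sketch assumes that intermediate rows are written by joining fields with '|' without escaping, and that a text value that cannot be parsed as a float becomes null.

```python
# Illustrative model only; none of these functions exist in Tajo.
DEFAULT_FIELD_DELIMITER = "|"  # the default named in the report


def to_float(text):
    """Parse a float, returning None (SQL null) when the text is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def leaf_scan(lines, delimiter):
    """Leaf scan: honors the table's own 'csvfile.delimiter'."""
    return [line.split(delimiter) for line in lines]


def write_intermediate(rows):
    """Assumption: intermediate data is joined with the default delimiter, unescaped."""
    return [DEFAULT_FIELD_DELIMITER.join(fields) for fields in rows]


def read_intermediate(lines):
    """Later operators split on the default delimiter, ignoring the table meta."""
    return [line.split(DEFAULT_FIELD_DELIMITER) for line in lines]


def select_name_score(rows):
    """Project columns 1 (name) and 2 (score) by position."""
    return [(fields[1], to_float(fields[2])) for fields in rows]


source = ["1;a;1.1", "2;a|b;2.4", "3;b|c|d;3.2"]
scanned = leaf_scan(source, ";")

# Projecting straight from the leaf scan keeps the '|' characters in name.
direct = select_name_score(scanned)

# Round-tripping through intermediate data shifts the columns.
via_intermediate = select_name_score(read_intermediate(write_intermediate(scanned)))

print(direct)
print(via_intermediate)
```

      Under these assumptions, the round trip through intermediate data yields the same three rows as the incorrect result reported above, while projecting directly from the leaf scan keeps the original name values.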


              People

              • Assignee:
                hyunsik Hyunsik Choi
              • Reporter:
                sirpkt Keuntae Park
