SystemDS / SYSTEMDS-2928

CSV parsing with non-default delimiter

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: SystemDS 2.1
    • Component/s: None
    • Labels: None

    Description

      After changing the delimiter in src/test/scripts/functions/io/csv/in/transfusion_2.single.csv from the default "," to a semicolon ";", the DML tests fail because the reader replaces the semicolon with the default delimiter; the subsequent split and double parsing then fail.

      Exception : class java.io.IOException
      Message : Read task for csv input failed: java.lang.NumberFormatException: For input string: "2 ,50,12500,98 ,1"
      4 > org.apache.sysds.runtime.io.ReaderTextCSVParallel.readCSVMatrixFromHDFS(ReaderTextCSVParallel.java:157)
      4 > org.apache.sysds.runtime.io.ReaderTextCSVParallel.readMatrixFromHDFS(ReaderTextCSVParallel.java:102)
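
      The failure mode described above can be sketched in isolation. The snippet below is not the SystemDS reader code; parseRow and the class name are hypothetical, and the sample line is reconstructed from the exception message under the assumption that the original row used ";" between cells. It shows that once the delimiter in the line and the delimiter used for splitting disagree, the whole row reaches Double.parseDouble as a single cell.

```java
import java.util.regex.Pattern;

// Illustration only: a hypothetical helper, not part of SystemDS.
class DelimiterMismatchSketch {

    // Splits one CSV line on the given delimiter (treated literally, not as a
    // regex) and parses every trimmed cell as a double.
    static double[] parseRow(String line, String delim) {
        String[] cells = line.split(Pattern.quote(delim), -1);
        double[] row = new double[cells.length];
        for (int i = 0; i < cells.length; i++) {
            row[i] = Double.parseDouble(cells[i].trim());
        }
        return row;
    }

    public static void main(String[] args) {
        // Assumed input row with the non-default delimiter.
        String line = "2 ;50;12500;98 ;1";

        // Delimiter in the data and in the split agree: five cells are parsed.
        double[] ok = parseRow(line, ";");
        System.out.println("parsed cells: " + ok.length);

        // The line is rewritten to use "," but the split still uses ";":
        // nothing is split, so the entire row is handed to parseDouble.
        String rewritten = line.replace(";", ",");
        try {
            parseRow(rewritten, ";");
        } catch (NumberFormatException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

      In the second call the exception message carries the full, unsplit row, which has the same shape as the input string reported in the stack trace above.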

      To reproduce this, I modified src/test/scripts/functions/io/csv/ReadCSVTest_2.dml, src/test/scripts/functions/io/csv/csv_verify2.R, and src/test/java/org/apache/sysds/test/functions/io/csv/ReadCSVTest.java (attached below).

      Attachments

        1. csv_verify2.R
          1 kB
          Olga Ovcharenko
        2. ReadCSVTest_2.dml
          1 kB
          Olga Ovcharenko
        3. ReadCSVTest.java
          3 kB
          Olga Ovcharenko

          People

            Assignee: Olga Ovcharenko
            Reporter: Olga Ovcharenko
            Votes: 0
            Watchers: 3