SystemDS / SYSTEMDS-677

Random data generator for decision tree fails w/ data type mismatch


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: SystemML 0.9
    • Fix Version/s: SystemML 0.10
    • Component/s: None
    • Labels: None

    Description

      The data generator for decision trees consists of a shell script that calls two DML scripts in sequence. The split is needed because the second script applies the file-based transform, which requires its input file to exist at compile time. However, the two scripts disagree on data type: the first script outputs a matrix, while the second script expects a frame.

      This task covers (1) a script-level change so that the first script outputs a frame, and (2) a fix so that the frame metadata file is written with a value type accepted by the subsequent transform.
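The mismatch and the script-level fix can be illustrated with a small standalone model. This is a hedged sketch in Python, not SystemML code: the helper names (`write_with_metadata`, `read_expecting`, `demo`), the `.mtd` sidecar naming, and the metadata fields used here are assumptions chosen for illustration. It only shows the general idea that a reader which checks the declared data type up front rejects a file written as a matrix and accepts the same data once it is written as a frame.

```python
import json
import os
import tempfile


def write_with_metadata(path, rows, data_type):
    """Write rows as CSV plus a JSON sidecar (<path>.mtd) describing them.

    Hypothetical helper: the sidecar layout is an assumption for this sketch.
    """
    with open(path, "w") as f:
        for row in rows:
            f.write(",".join(str(v) for v in row) + "\n")
    meta = {
        "data_type": data_type,  # "matrix" or "frame"
        "rows": len(rows),
        "cols": len(rows[0]) if rows else 0,
        "format": "csv",
    }
    with open(path + ".mtd", "w") as f:
        json.dump(meta, f)


def read_expecting(path, expected_type):
    """Check the declared data type before reading; raise on a mismatch."""
    with open(path + ".mtd") as f:
        meta = json.load(f)
    if meta["data_type"] != expected_type:
        raise TypeError(
            f"expected {expected_type}, found {meta['data_type']}"
        )
    with open(path) as f:
        return [line.rstrip("\n").split(",") for line in f]


def demo():
    """Reproduce the mismatch, then apply the 'write a frame' fix."""
    rows = [[1.0, 2.0], [3.0, 4.0]]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.csv")

        # Before the fix: the first step declares a matrix.
        write_with_metadata(path, rows, "matrix")
        try:
            read_expecting(path, "frame")
            before = "ok"
        except TypeError:
            before = "mismatch"

        # After the fix: the first step declares a frame instead.
        write_with_metadata(path, rows, "frame")
        after = read_expecting(path, "frame")
    return before, after
```

Calling `demo()` runs both cases against a temporary directory and returns the outcome of each read, so the failing and the fixed behaviour can be compared side by side.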

      Note that the script-level change already uses matrix-frame casting, which was introduced as part of SYSTEMML-554; as of today, this builtin function is only supported in CP. As a result, the data generator only works for small data that fits into driver memory. Once the Spark/MR converters from SYSTEMML-560 are fully integrated, the script will run on large data as well, without further script changes.

            People

              Assignee: Matthias Boehm (mboehm7)
              Reporter: Matthias Boehm (mboehm7)