  1. SystemDS
  2. SYSTEMDS-1518

Corrupted input file names in old and new mlcontext apis

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: SystemML 0.14
    • Component/s: None
    • Labels: None

    Description

      Both the old and the new MLContext APIs call OptimizerUtils.getUniqueTempFileName() to create HDFS file names for registered input frames and matrices. This call simply forwards the request to Dag, which keeps the names consistent with the HDFS file names of intermediates and ensures isolation between concurrently running scripts (from different client processes on a shared cluster).

      However, on this code path the internal scratch space configuration is always uninitialized, which leads to corrupt file names such as /_p1234_1.2.345.678//_t0/temp1_0. The missing scratch_space prefix is problematic because the remainder is interpreted as an absolute file path, which often causes permission errors because typical users are not granted write access to the HDFS root.
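
      The effect of the missing prefix can be illustrated with a minimal sketch. It is not the SystemML implementation: the class ScratchPathSketch, its methods, and the exact way the path segments are concatenated are assumptions made for illustration, chosen only so that an empty scratch space reproduces the file name quoted above.

```java
// Illustrative sketch only; all names here are hypothetical, not SystemML APIs.
class ScratchPathSketch {

    // Assumed layout: <scratch>/_p<pid>_<ip>//_t<threadId>/temp<seq>_0
    static String tempFileName(String scratchSpace, long pid, String ip,
                               long threadId, long seq) {
        return scratchSpace + "/" + "_p" + pid + "_" + ip
                + "//" + "_t" + threadId + "/temp" + seq + "_0";
    }

    // A path starting with '/' is resolved from the file system root
    // rather than from the user's working directory.
    static boolean isAbsolute(String path) {
        return path.startsWith("/");
    }

    // One possible guard: fall back to a default when the configuration
    // has not been initialized (null or empty).
    static String scratchOrDefault(String configured) {
        return (configured == null || configured.isEmpty())
                ? "scratch_space" : configured;
    }

    public static void main(String[] args) {
        // Uninitialized scratch space: the name becomes an absolute path.
        String corrupt = tempFileName("", 1234, "1.2.345.678", 0, 1);
        System.out.println(corrupt + " absolute=" + isAbsolute(corrupt));

        // With the prefix in place the name stays relative.
        String ok = tempFileName(scratchOrDefault(""), 1234, "1.2.345.678", 0, 1);
        System.out.println(ok + " absolute=" + isAbsolute(ok));
    }
}
```

      With an empty prefix the first path segment is lost and the name starts at the root, which is where the permission errors come from. With the prefix restored, the name is relative and resolves under a directory the user can write to.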

      Note that this issue might not be immediately visible in all scenarios because it only affects input variables that are exported to HDFS (e.g., during guarded collect or as specific inputs to remote parfor).

          People

            Assignee: Matthias Boehm (mboehm7)
            Reporter: Matthias Boehm (mboehm7)
            Votes: 0
            Watchers: 4