  1. Mahout
  2. MAHOUT-666

DistributedSparseMatrix should clean up after itself when doing times(Vector) and timesSquared(Vector)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.5
    • Fix Version/s: 0.5
    • Component/s: Math
    • Labels:
      None
    • Environment:

      Linux x86_64 2.6.18, Mac OS 10.6 64-bit, Hadoop 0.20.2, Java 1.6

      Description

      The times() and timesSquared() methods in DistributedSparseMatrix leave behind a lot of cruft in the directories they create. The individual files are tagged with deleteOnExit, but the directories are not. Also, because nothing is deleted until JVM exit, a job that does repeated matrix/vector multiplies, like DistributedLanczosSolver, creates a lot of temp files that stick around for the whole run, even though the results they contain are read once and then never again.

      Our cluster admins enforce both file count and size quotas, so since 5 temp files/directories are created on each iteration of DistributedLanczosSolver, we're constantly bumping into the quota with large SVDs.
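
The cleanup pattern described above can be sketched as follows. This is a minimal illustration, not the attached patch: it uses `java.nio.file` on the local filesystem as a stand-in for HDFS, and the class `TempDirCleanupSketch` and the method `multiplyOnce` are hypothetical names invented for the example. In Hadoop code the equivalent recursive delete would presumably go through `FileSystem.delete(Path, boolean)` with the recursive flag set, instead of relying on `FileSystem.deleteOnExit(Path)`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TempDirCleanupSketch {

    /** Recursively deletes a directory and everything under it. */
    public static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /**
     * Hypothetical stand-in for one times(Vector) call: a "job" writes its
     * result into a fresh temp directory, the caller reads it back once, and
     * the directory is removed immediately instead of at JVM exit.
     * The "multiply" here is just a scaling by 2 (assumption, for illustration).
     */
    public static double[] multiplyOnce(Path workRoot, double[] input) throws IOException {
        Path outDir = Files.createTempDirectory(workRoot, "times-");
        try {
            // Simulate the job writing its output file.
            List<String> lines = new ArrayList<>();
            for (double x : input) {
                lines.add(Double.toString(2.0 * x));
            }
            Path part = outDir.resolve("part-00000");
            Files.write(part, lines);

            // Read the result exactly once.
            List<String> read = Files.readAllLines(part);
            double[] result = new double[read.size()];
            for (int i = 0; i < result.length; i++) {
                result[i] = Double.parseDouble(read.get(i));
            }
            return result;
        } finally {
            // Clean up the whole directory, not just the files inside it.
            deleteRecursively(outDir);
        }
    }

    public static void main(String[] args) throws IOException {
        Path workRoot = Files.createTempDirectory("lanczos-sketch-");
        double[] v = {1.0, 2.0, 3.0};
        for (int i = 0; i < 10; i++) {
            v = multiplyOnce(workRoot, v);
        }
        try (Stream<Path> leftovers = Files.list(workRoot)) {
            System.out.println("leftover entries: " + leftovers.count());
        }
        deleteRecursively(workRoot);
    }
}
```

Because the delete sits in a `finally` block, the temp directory goes away even if reading the result fails, so the number of live temp entries stays constant across iterations rather than growing with each multiply.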

        Attachments

        1. mahout-666.patch
          6 kB
          Jonathan Traupman
        2. mahout-666.patch
          7 kB
          Jonathan Traupman

            People

            • Assignee:
              jake.mannix Jake Mannix
              Reporter:
              jtraupman Jonathan Traupman
            • Votes:
              0
              Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                4h
                Remaining:
                4h
                Logged:
                Not Specified