MAHOUT-834

RowSimilarityJob doesn't clean its temp dir, and fails when seeing it again

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.6, 0.7
    • Fix Version/s: 0.7
    • Component/s: Integration
    • Labels:
      None

      Description

      If I do this:

      mahout rowsimilarity --input matrixified/matrix --output sims/ --numberOfColumns 27684 --similarityClassname SIMILARITY_LOGLIKELIHOOD --excludeSelfSimilarity

      then clean my output,

      rm -rf sims/ # (though this step doesn't even seem needed)

      then try again:

      mahout rowsimilarity --input matrixified/matrix --output sims/ --numberOfColumns 27684 --similarityClassname SIMILARITY_LOGLIKELIHOOD --excludeSelfSimilarity

      The temp files left over from the first run make a re-run impossible. We get:

      Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory temp/weights already exists

      Manually deleting the temp directory fixes this.

      I get the same behaviour if I explicitly pass in a --tempDir path, e.g.:

      mahout rowsimilarity --input matrixified/matrix --output sims/ --numberOfColumns 27684 --similarityClassname SIMILARITY_LOGLIKELIHOOD --excludeSelfSimilarity --tempDir tmp2/

      Presumably something like HadoopUtil.delete(getConf(), tempDirPath) is needed somewhere (and maybe an --overwrite option too)?
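
      To illustrate the failure mode and the kind of cleanup suggested above, here is a minimal local-filesystem sketch. It is not Mahout code and not the attached patch: the class TempDirCleanup and the methods runPhase, deleteRecursively and runAndCleanUp are hypothetical names, and java.nio stands in for the Hadoop FileSystem calls a real fix would presumably use. It only shows that a phase which refuses to overwrite its output directory can be re-run once the temp dir is deleted after each run.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Local-filesystem analogue of the temp-dir problem; hypothetical, not Mahout code. */
public class TempDirCleanup {

    /** Stand-in for one job phase: refuses to run if its output directory already exists. */
    public static void runPhase(Path tempDir) throws IOException {
        Path weights = tempDir.resolve("weights");
        if (Files.exists(weights)) {
            throw new FileAlreadyExistsException(
                "Output directory " + weights + " already exists");
        }
        Files.createDirectories(weights);
    }

    /** Recursively deletes dir if it exists, removing children before their parents. */
    public static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse lexicographic order visits the deepest paths first.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    /** Runs the phase and removes the temp dir afterwards, even if the phase fails. */
    public static void runAndCleanUp(Path tempDir) throws IOException {
        try {
            runPhase(tempDir);
        } finally {
            deleteRecursively(tempDir);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tempDir = Files.createTempDirectory("rowsim").resolve("temp");
        runAndCleanUp(tempDir);
        runAndCleanUp(tempDir); // the re-run works because the first run cleaned up
        System.out.println("temp dir exists after runs: " + Files.exists(tempDir));
    }
}
```

      Cleaning up in a finally block is one possible design choice; deleting a stale temp dir at the start of the job (or only when --overwrite is given) would be an alternative with different trade-offs for debugging failed runs.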

      1. Mahout-834.patch
        2 kB
        Suneel Marthi
      2. Mahout-834.patch
        2 kB
        Suneel Marthi

        Activity

        (Field: Original Value -> New Value)

        Sean Owen made changes -
          Status: Resolved [ 5 ] -> Closed [ 6 ]
        Sebastian Schelter made changes -
          Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
          Resolution: (none) -> Fixed [ 1 ]
        Suneel Marthi made changes -
          Attachment: (none) -> Mahout-834.patch [ 12525838 ]
        Suneel Marthi made changes -
          Status: Open [ 1 ] -> Patch Available [ 10002 ]
          Fix Version/s: (none) -> 0.7 [ 12319261 ]
        Suneel Marthi made changes -
          Status: Patch Available [ 10002 ] -> Open [ 1 ]
          Affects Version/s: (none) -> 0.7 [ 12319261 ]
          Assignee: Grant Ingersoll [ gsingers ] -> Suneel Marthi [ smarthi ]
        Suneel Marthi made changes -
          Assignee: Suneel Marthi [ smarthi ] -> Grant Ingersoll [ gsingers ]
        Suneel Marthi made changes -
          Status: Open [ 1 ] -> Patch Available [ 10002 ]
          Affects Version/s: (none) -> 0.6 [ 12316364 ]
        Suneel Marthi made changes -
          Attachment: (none) -> Mahout-834.patch [ 12512478 ]
        Grant Ingersoll made changes -
          Assignee: (none) -> Suneel Marthi [ smarthi ]
        Dan Brickley created issue -

          People

          • Assignee: Suneel Marthi
          • Reporter: Dan Brickley
          • Votes: 0
          • Watchers: 0

            Dates

            • Created:
              Updated:
              Resolved:
