Mahout / MAHOUT-395

Using KMeansDriver leaves open files and can lead to FileNotFoundException - "too many open files" error

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.1, 0.2, 0.3, 0.4
    • Fix Version/s: 0.4
    • Component/s: Clustering
    • Labels:
      None

      Description

      KMeansDriver uses the isConverged() method to determine whether the k-means clustering run is complete. isConverged() has to open each SequenceFile and read each cluster to see whether the cluster it contains has converged. The readers are not explicitly closed during this process, so when a large number of sequence files are opened, the driving system may run out of file handles before the readers are eventually implicitly reclaimed. I'm attaching a patch that explicitly closes these files once they no longer need to remain open.
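
      The pattern described above can be sketched in plain Java. This is not the attached patch or the code applied to Mahout; it is a minimal illustration that assumes a hypothetical ConvergenceCheck class and a made-up text format (one "true"/"false" line per cluster in each part-* file) in place of real SequenceFiles. The point is only that each reader is closed as soon as its file has been checked, including on the early-return path. In Mahout itself the resource would be a Hadoop SequenceFile.Reader, which likewise has a close() method.

      ```java
      import java.io.BufferedReader;
      import java.io.IOException;
      import java.nio.charset.StandardCharsets;
      import java.nio.file.DirectoryStream;
      import java.nio.file.Files;
      import java.nio.file.Path;

      public class ConvergenceCheck {

          // Hypothetical stand-in for isConverged(): every part-* file in the
          // directory holds one line per cluster, "true" if that cluster converged.
          static boolean isConverged(Path clustersDir) throws IOException {
              // The directory listing is itself an open handle, so it is closed too.
              try (DirectoryStream<Path> parts = Files.newDirectoryStream(clustersDir, "part-*")) {
                  for (Path part : parts) {
                      // try-with-resources closes the reader when this block exits,
                      // whether normally, by the early return, or by an exception.
                      try (BufferedReader reader = Files.newBufferedReader(part, StandardCharsets.UTF_8)) {
                          String line;
                          while ((line = reader.readLine()) != null) {
                              if (!Boolean.parseBoolean(line.trim())) {
                                  return false; // one unconverged cluster is enough
                              }
                          }
                      }
                  }
              }
              return true;
          }

          public static void main(String[] args) throws IOException {
              Path dir = Files.createTempDirectory("clusters");
              Files.write(dir.resolve("part-00000"), "true\ntrue\n".getBytes(StandardCharsets.UTF_8));
              Files.write(dir.resolve("part-00001"), "true\nfalse\n".getBytes(StandardCharsets.UTF_8));
              System.out.println(isConverged(dir));
          }
      }
      ```

      On Java versions without try-with-resources, the same effect is achieved by calling close() in a finally block after each file is read.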

        Activity

        Sean Owen made changes -
        Status: Resolved [ 5 ] -> Closed [ 6 ]
        Drew Farris made changes -
        Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
        Assignee: Drew Farris [ drew.farris ]
        Fix Version/s: 0.4 [ 12314396 ]
        Resolution: Fixed [ 1 ]
        Drew Farris added a comment (edited):

        applied in r944550, with minor revisions. Thanks for the patch.
        Scott Ganyo made changes -
        Attachment: KMeansDriver.patch [ 12444554 ]
        Scott Ganyo made changes -
        Status: Open [ 1 ] -> Patch Available [ 10002 ]
        Scott Ganyo created issue -

          People

          • Assignee:
            Drew Farris
            Reporter:
            Scott Ganyo
          • Votes:
            0
            Watchers:
            1
