Mahout / MAHOUT-1218

Streaming k-means fails when the number of estimated map clusters is <= the number of clusters specified

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 0.8
    • Fix Version/s: 0.8
    • Component/s: Clustering
    • Labels:
      None

      Description

      Running Streaming k-means with CosineDistanceMeasure, Fast Projection Search, number of clusters k = 60, and number of estimated map clusters -km = 60 throws the following exception:

      Exception in thread "main" java.lang.IllegalArgumentException: Invalid number of estimated map clusters; There must be more than the final number of clusters (k log n vs k)
      	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
      	at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansDriver.configureOptionsForWorkers(StreamingKMeansDriver.java:327)
      	at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansDriver.configureOptionsForWorkers(StreamingKMeansDriver.java:280)
      	at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansDriver.run(StreamingKMeansDriver.java:227)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
      	at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansDriver.main(StreamingKMeansDriver.java:472)
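
      The exception message indicates that the number of estimated map clusters (-km) must be strictly greater than the final number of clusters (k), and hints at roughly k log n as a suitable value. The following is a minimal sketch of that constraint, not the actual Mahout source: the class name, both helper methods, the use of the natural logarithm, and the data set size n are assumptions made for illustration.

```java
// Illustrative sketch only; this is not the code in StreamingKMeansDriver.
class EstimatedClustersCheck {

    // Hypothetical helper mirroring the precondition named in the exception:
    // the estimated map clusters (km) must be strictly greater than k.
    public static void checkEstimatedMapClusters(int k, int km) {
        if (km <= k) {
            throw new IllegalArgumentException(
                "Invalid number of estimated map clusters (" + km
                    + "); must be greater than the final number of clusters (" + k + ")");
        }
    }

    // Hypothetical rule of thumb based on the "k log n" hint in the message.
    // The natural logarithm is an assumption; the message does not name a base.
    public static int suggestEstimatedMapClusters(int k, long n) {
        return (int) Math.ceil(k * Math.log(n));
    }

    public static void main(String[] args) {
        int k = 60;
        long n = 1_000_000L; // assumed number of input points, for illustration

        int km = suggestEstimatedMapClusters(k, n);
        checkEstimatedMapClusters(k, km); // passes because km > k
        System.out.println("k=" + k + ", suggested km=" + km);

        try {
            checkEstimatedMapClusters(k, 60); // the configuration from this report
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

      Under this reading, the reported configuration (k = 60, -km = 60) is rejected by design, which is consistent with the issue being resolved as Invalid.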
      

            People

            • Assignee:
              smarthi Suneel Marthi
            • Reporter:
              smarthi Suneel Marthi
            • Votes:
              0
            • Watchers:
              4
