Mahout / MAHOUT-966

Mismatch in the number of points given by the clusterDumper and ClusterOutputPostProcessor

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 0.6
    • Fix Version/s: 0.8
    • Component/s: classic
    • Labels: None
    • Environment: hadoop 0.20.2 mahout 0.6

    Description

      After running the post processor, the number of points each cluster contains does not match the number of points that clusterdumper reports for that cluster.

      MSV-287

      { n=90 c=[0.05195, 0.05675, 0.07151, 0.05713, 0.06946,...}

      MSV-145

      { n=90 c=[0.93685, 0.93071, 0.93641, 0.94629, 0.94409,..}

      The n reported in clusters-n-final for each cluster differs from the number of points actually contained in the directory for that cluster. Any idea why this is happening?
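
The comparison described above can be sketched in a few lines of Python. This is only an illustration, not part of Mahout: it assumes the dumper text has the shape shown in the excerpt (a cluster id line followed by a `{ n=... c=[...` line), and it assumes the per-cluster point counts from the post processor output have already been counted by some other means and collected into a dictionary. The function names `parse_dumper_counts` and `find_mismatches` are hypothetical.

```python
import re

# Matches the "n=<count>" field seen in the excerpt, e.g. "{ n=90 c=[...".
N_FIELD = re.compile(r"\bn=(\d+)")


def parse_dumper_counts(lines):
    """Map cluster id -> reported n.

    Assumption: each cluster appears as an id line (e.g. "MSV-287")
    followed by a line containing "n=<count>". Blank lines are ignored.
    """
    counts = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = N_FIELD.search(line)
        if match and current_id is not None:
            counts[current_id] = int(match.group(1))
            current_id = None
        elif not match:
            # Treat any non-empty line without an n= field as a cluster id.
            current_id = line
    return counts


def find_mismatches(dumper_counts, actual_counts):
    """Return {cluster_id: (reported_n, actual_n)} for clusters that differ.

    actual_counts is a hypothetical dict of cluster id -> number of points
    found in that cluster's post-processed output; a missing id counts as 0.
    """
    return {
        cid: (n, actual_counts.get(cid, 0))
        for cid, n in dumper_counts.items()
        if actual_counts.get(cid, 0) != n
    }
```

Feeding the lines of the dumper output to `parse_dumper_counts` and passing the result to `find_mismatches` together with the independently counted points would list exactly the clusters where the two tools disagree, which makes the discrepancy easier to report precisely.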

      Attachments

        1. clusterpp-output.txt
          14 kB
          Tharindu Mathew
        2. cluster-dumper-output.txt
          1.59 MB
          Tharindu Mathew
        3. mtestdata.txt
          1.76 MB
          Gaurav Redkar
        4. points100dCCNorm.txt
          1.77 MB
          Gaurav Redkar

          People

            Assignee: Grant Ingersoll (gsingers)
            Reporter: Gaurav Redkar (gaurav14)
            Votes: 0
            Watchers: 5
