  Mahout / MAHOUT-966

Mismatch in the number of points given by the clusterDumper and ClusterOutputPostProcessor

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 0.6
    • Fix Version/s: 0.8
    • Component/s: Integration
    • Labels:
      None
    • Environment:

      Hadoop 0.20.2, Mahout 0.6

      Description

      After running the post processor, the number of points that each cluster actually contains does not match the number of points that clusterdumper reports for that cluster.

      MSV-287

      { n=90 c=[0.05195, 0.05675, 0.07151, 0.05713, 0.06946,...}

      MSV-145

      { n=90 c=[0.93685, 0.93071, 0.93641, 0.94629, 0.94409,..}

      The n reported for each cluster in clusters-n-final differs from the number of points actually contained in the directory for that cluster. Any idea why this is happening?
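One way to pin down which clusters disagree is to parse the n values out of the dumped text and compare them with independently obtained per-cluster counts. The following is a minimal sketch, assuming the dump is plain text in the form excerpted above (a cluster id followed by `{ n=...`). The helper names `reported_counts` and `find_mismatches` are hypothetical and not part of Mahout, and the `actual` counts are made-up placeholders standing in for whatever is counted in each post-processed cluster directory.

```python
import re

# Matches a cluster id such as "MSV-287" followed by "{ n=90 ...",
# allowing whitespace or a line break between the id and the brace.
CLUSTER_RE = re.compile(r"([A-Za-z]+-\d+)\s*\{\s*n=(\d+)")


def reported_counts(dump_text):
    """Return {cluster_id: n} parsed from clusterdump-style text."""
    return {cid: int(n) for cid, n in CLUSTER_RE.findall(dump_text)}


def find_mismatches(reported, actual):
    """Return {cluster_id: (reported_n, actual_n)} where the counts differ.

    A cluster missing from `actual` is treated as having 0 points.
    """
    return {
        cid: (n, actual.get(cid, 0))
        for cid, n in reported.items()
        if actual.get(cid, 0) != n
    }


# Shortened excerpt in the same shape as the dump quoted in this issue.
sample = """
MSV-287
{ n=90 c=[0.05195, 0.05675,...}
MSV-145
{ n=90 c=[0.93685, 0.93071,..}
"""

# Hypothetical counts for illustration only; in practice these would come
# from counting the records in each post-processed cluster directory.
actual = {"MSV-287": 90, "MSV-145": 75}

mismatches = find_mismatches(reported_counts(sample), actual)
print(mismatches)
```

Listing only the clusters that differ, together with both counts, makes it easier to tell whether the discrepancy affects every cluster or just a few.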

        Attachments

        1. points100dCCNorm.txt
          1.77 MB
          Gaurav Redkar
        2. mtestdata.txt
          1.76 MB
          Gaurav Redkar
        3. clusterpp-output.txt
          14 kB
          Tharindu Mathew
        4. cluster-dumper-output.txt
          1.59 MB
          Tharindu Mathew

            People

            • Assignee:
              gsingers Grant Ingersoll
              Reporter:
              gaurav14 Gaurav Redkar