MAHOUT-1344

Self-Organizing Map algorithm (batch version)


    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 0.8
    • Fix Version/s: 0.10.0
    • Component/s: Clustering
    • Labels:
      None

      Description

      Good morning.
      As part of my final-year project, I have written a new module for Apache Mahout that implements the batch version of Kohonen's self-organizing map algorithm.

      The work is complete, and I will submit a patch as soon as possible. It was developed against Mahout 0.8.
      The patch includes unit tests, and the algorithm was used successfully on a Hadoop cluster to cluster two large datasets. Results can be seen in this image gallery.

      The implementation uses the generic clustering algorithms implemented in the ClusterIterator class. Minor changes were made to this and other related classes to support some of the features, without affecting the execution of other algorithms.

      The algorithm supports convergence detection and can resume a job at a given iteration (mainly in order to initialize KohonenBatchClusteringPolicy with a given iteration number, although this also affects the names of the output directories).
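
      For readers unfamiliar with the batch variant, here is a minimal, self-contained Java sketch of a batch SOM on a one-dimensional map. It does not use the Mahout API and is not taken from the patch. The class name BatchSomSketch, the Gaussian neighbourhood, the exponential radius schedule, and the startIteration and tolerance parameters are all illustrative assumptions. They are only meant to show why resuming needs the iteration number: the neighbourhood radius depends on it.

```java
// Illustrative sketch only: plain Java, no Mahout classes, 1-D map topology assumed.
final class BatchSomSketch {

    /** Index of the unit whose weight vector is closest (squared Euclidean) to x. */
    static int bestMatchingUnit(double[][] weights, double[] x) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int u = 0; u < weights.length; u++) {
            double d = 0.0;
            for (int k = 0; k < x.length; k++) {
                double diff = weights[u][k] - x[k];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = u;
            }
        }
        return best;
    }

    /** Assumed schedule: radius decays exponentially from startRadius to endRadius. */
    static double radius(int iteration, int maxIterations, double startRadius, double endRadius) {
        double t = (double) iteration / Math.max(1, maxIterations - 1);
        return startRadius * Math.pow(endRadius / startRadius, t);
    }

    /** One batch pass: every unit becomes the neighbourhood-weighted mean of all samples. */
    static double[][] batchStep(double[][] weights, double[][] data, double sigma) {
        int units = weights.length;
        int dim = weights[0].length;
        double[][] numer = new double[units][dim];
        double[] denom = new double[units];
        for (double[] x : data) {
            int bmu = bestMatchingUnit(weights, x);
            for (int u = 0; u < units; u++) {
                double gridDist = u - bmu; // distance on the 1-D map grid
                double h = Math.exp(-(gridDist * gridDist) / (2.0 * sigma * sigma));
                denom[u] += h;
                for (int k = 0; k < dim; k++) {
                    numer[u][k] += h * x[k];
                }
            }
        }
        double[][] next = new double[units][dim];
        for (int u = 0; u < units; u++) {
            for (int k = 0; k < dim; k++) {
                // Keep the old weight if no sample contributed to this unit.
                next[u][k] = denom[u] > 0 ? numer[u][k] / denom[u] : weights[u][k];
            }
        }
        return next;
    }

    /** Runs iterations startIteration..maxIterations-1, stopping early once weights settle. */
    static double[][] train(double[][] weights, double[][] data, int startIteration,
                            int maxIterations, double startRadius, double endRadius,
                            double tolerance) {
        double[][] current = weights;
        for (int it = startIteration; it < maxIterations; it++) {
            double sigma = radius(it, maxIterations, startRadius, endRadius);
            double[][] next = batchStep(current, data, sigma);
            double maxShift = 0.0;
            for (int u = 0; u < next.length; u++) {
                for (int k = 0; k < next[u].length; k++) {
                    maxShift = Math.max(maxShift, Math.abs(next[u][k] - current[u][k]));
                }
            }
            current = next;
            if (maxShift < tolerance) {
                break; // converged: no weight moved more than the tolerance
            }
        }
        return current;
    }
}
```

      In this sketch, a resumed run would call train with the saved weights and a non-zero startIteration, so that the radius schedule continues where it stopped instead of restarting from the widest neighbourhood.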

        Attachments

        1. MAHOUT-1344.patch
          163 kB
          Álvaro Pérez Alarcón

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              alvaropa Álvaro Pérez Alarcón
            • Votes:
              0
              Watchers:
              5

              Dates

              • Created:
                Updated:
                Resolved: