Mahout
  1. Mahout
  2. MAHOUT-1052

Add an option to MinHashDriver that specifies the dimension of vector to hash (indexes or values)

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 0.8
    • Component/s: Clustering
    • Labels:

      Description

      Add a parameter to MinHash clustering that specifies the dimension of vector to hash (indexes or values). Current version of MinHash clustering only hashed values of vectors. Based on discussion on dev-mahout list, both of the use-cases are possible and frequently met in practice.
      Preserve backward compatibility with default dimension set to values. Add new unit tests.

      1. MAHOUT-1052.patch
        16 kB
        Suneel Marthi
      2. MAHOUT-1052.patch
        14 kB
        Elena Smirnova

        Activity

          People

          • Assignee:
            Suneel Marthi
            Reporter:
            Elena Smirnova
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development