Mahout / MAHOUT-344

Minhash based clustering

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.3
    • Fix Version/s: 0.4
    • Component/s: Clustering
    • Labels:
      None

      Description

      MinHash clustering performs probabilistic dimension reduction of high-dimensional data. The essence of the technique is to hash each item with multiple independent hash functions, chosen so that similar items are more likely to collide than dissimilar ones. Multiple such hash tables can then be constructed to answer near-neighbor queries efficiently.
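
      The description above can be illustrated with a minimal sketch. It is not the code from the attached patches; it only assumes that each item is a non-empty set of integer feature ids, and every name in it (make_hash_functions, minhash_signature, minhash_cluster, keys_per_group, estimate_similarity) is hypothetical. The linear hash family (a*x + b) mod p is one common choice, used here as an assumption.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus (assumption)


def make_hash_functions(num_hashes, seed=42):
    """Build num_hashes hash functions of the form (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(num_hashes)]
    # Bind a and b as default arguments so each lambda keeps its own pair.
    return [lambda x, a=a, b=b: (a * x + b) % PRIME for a, b in params]


def minhash_signature(features, hash_functions):
    """Signature = minimum hash value of the feature set under each function.

    Assumes features is a non-empty iterable of integers.
    """
    return tuple(min(h(f) for f in features) for h in hash_functions)


def estimate_similarity(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def minhash_cluster(items, num_hashes=20, keys_per_group=2):
    """Group items whose signatures agree on a whole group of hash values.

    items maps an item id to its feature set. Each signature is cut into
    groups of keys_per_group values; every group acts as a key into its own
    hash table, so similar items are likely to share at least one bucket.
    """
    hash_functions = make_hash_functions(num_hashes)
    buckets = defaultdict(set)
    for item_id, features in items.items():
        signature = minhash_signature(features, hash_functions)
        for start in range(0, len(signature), keys_per_group):
            key = (start, signature[start:start + keys_per_group])
            buckets[key].add(item_id)
    return buckets


if __name__ == "__main__":
    docs = {
        "a": {1, 2, 3, 4, 5},
        "b": {1, 2, 3, 4, 6},
        "c": {100, 200, 300},
    }
    for key, members in minhash_cluster(docs).items():
        if len(members) > 1:
            print(key, sorted(members))
```

      Larger values of keys_per_group make buckets stricter (fewer false positives), while more groups give similar items more chances to meet in some table. Both parameters are tuning knobs in this sketch, not values taken from the issue.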

        Attachments

        1. MAHOUT-344-v1.patch
          17 kB
          Ankur Bansal
        2. MAHOUT-344-v2.patch
          34 kB
          Cristi Prodan
        3. MAHOUT-344-v3.patch
          41 kB
          Cristi Prodan
        4. MAHOUT-344-v4.patch
          39 kB
          Ankur Bansal
        5. MAHOUT-344-v5.patch
          28 kB
          Ankur Bansal
        6. MAHOUT-344-v6.patch
          42 kB
          Ankur Bansal
        7. MAHOUT-344-v7.patch
          48 kB
          Ankur Bansal

            People

            • Assignee:
              Ankur Bansal
              Reporter:
              Ankur Bansal