Mahout / MAHOUT-163

Get (better) cluster labels using Log Likelihood Ratio

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.3
    • Component/s: None
    • Labels: None

    Description

      Log Likelihood Ratio (LLR) is a better technique for identifying cluster labels than taking the top features of the centroid vector. LLR finds terms/phrases that are common within the cluster but rare outside it.
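
To make the idea concrete, here is a minimal sketch of LLR-based labelling. It is not the code from the attached patches. It assumes the usual 2x2 contingency-table form of the log-likelihood ratio over document counts, and every name in it (`llr`, `cluster_labels`, the toy documents) is hypothetical. Because LLR is symmetric, the sketch also assumes we want to skip terms that are relatively more frequent outside the cluster than inside it.

```python
import math
from collections import Counter


def x_log_x(x):
    """Return x * ln(x), treating 0 * ln(0) as 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized entropy of a list of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 contingency table.

    k11: documents in the cluster that contain the term
    k12: documents in the cluster that do not contain the term
    k21: documents outside the cluster that contain the term
    k22: documents outside the cluster that do not contain the term
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))


def cluster_labels(in_docs, out_docs, top_n=3):
    """Rank terms of a cluster by LLR (hypothetical helper).

    in_docs / out_docs are lists of token lists for documents inside
    and outside the cluster.
    """
    n_in, n_out = len(in_docs), len(out_docs)
    df_in = Counter(t for doc in in_docs for t in set(doc))
    df_out = Counter(t for doc in out_docs for t in set(doc))

    scored = []
    for term, k11 in df_in.items():
        k21 = df_out.get(term, 0)
        # Assumption: keep only terms over-represented in the cluster.
        # Cross-multiplied comparison of k11/n_in and k21/n_out.
        if k11 * n_out <= k21 * n_in:
            continue
        scored.append((llr(k11, n_in - k11, k21, n_out - k21), term))

    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:top_n]]


in_cluster = [["hadoop", "the"], ["hadoop", "the"], ["hadoop", "cluster", "the"]]
out_cluster = [["the", "cat"], ["the", "dog"], ["the", "cat"]]
labels = cluster_labels(in_cluster, out_cluster, top_n=2)
```

In this toy data "the" appears in every document, so it would rank highly as a centroid feature yet carries no information about the cluster; the LLR ranking drops it and prefers terms concentrated inside the cluster.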

      Attachments

        1. mahout-cluster-labels-llr.patch
          29 kB
          Shashikant Kore
        2. MAHOUT-163-17sep.patch
          24 kB
          Shashikant Kore
        3. MAHOUT-163.patch
          23 kB
          Shashikant Kore
        4. MAHOUT-163.patch
          23 kB
          Grant Ingersoll
        5. MAHOUT-163.patch
          28 kB
          Grant Ingersoll
        6. MAHOUT-163.patch
          28 kB
          Grant Ingersoll
        7. mahout-163.patch
          29 kB
          Shashikant Kore

          People

            Assignee: Grant Ingersoll (gsingers)
            Reporter: Shashikant Kore (kshashi)
