Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: modules/analysis
    • Labels: None
    • Lucene Fields: New, Patch Available

    Description

      An analyzer for Hindi.

      Below are MAP values on the FIRE 2008 test collection.
      QE means query expansion with MoreLikeThis, using all defaults, on the top 5 docs.

      setup          T       T(QE)   TD      TD(QE)   TDN     TDN(QE)
      words only     0.1646  0.1979  0.2241  0.2513   0.2468  0.2735
      HindiAnalyzer  0.2875  0.3071  0.3387  0.3791*  0.3837  0.3810
      improvement    74.67%  55.18%  51.14%  50.86%   55.47%  39.31%
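
The improvement row appears to be the relative MAP gain of HindiAnalyzer over the words-only baseline, i.e. (improved - baseline) / baseline. The sketch below recomputes it from the two MAP rows under that assumption; the class and method names are hypothetical and not part of the patch.

```java
import java.util.Locale;

public class RelativeImprovement {
    // Relative gain of `improved` over `baseline`, as a percentage.
    // Assumption: this is how the "improvement" row was derived.
    public static double percent(double baseline, double improved) {
        return (improved - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        String[] setups = {"T", "T(QE)", "TD", "TD(QE)", "TDN", "TDN(QE)"};
        // MAP values copied from the two rows of the table.
        double[] wordsOnly = {0.1646, 0.1979, 0.2241, 0.2513, 0.2468, 0.2735};
        double[] hindiAnalyzer = {0.2875, 0.3071, 0.3387, 0.3791, 0.3837, 0.3810};

        for (int i = 0; i < setups.length; i++) {
            System.out.printf(Locale.ROOT, "%-8s %.2f%%%n",
                    setups[i], percent(wordsOnly[i], hindiAnalyzer[i]));
        }
    }
}
```

Rounded to two decimals, the computed values match the improvement row, which supports reading it as a relative rather than absolute difference in MAP.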

      The patch needs a bit of cleanup and more tests.

      Attachments

        1. LUCENE-2234.patch
          62 kB
          Robert Muir
        2. LUCENE-2234.patch
          51 kB
          Robert Muir

          People

            Assignee: rcmuir Robert Muir
            Reporter: rcmuir Robert Muir