SOLR-2982

Upgrade Apache Commons Codec to version 1.6 in order to add new Beider-Morse Phonetic Matching (BMPM) option


Details

    Description

      Apache Commons Codec released version 1.6 in November 2011. Along with a few bug fixes, 1.6 adds a new phonetic matching system, Beider-Morse Phonetic Matching (BMPM), that is far superior to the existing phonetic codecs such as Soundex, Metaphone, and Caverphone. BMPM itself has been available for some time, but this is its first port to Java and its first commit in the Apache ecosystem.

      For more information, see http://stevemorse.org/phoneticinfo.htm and http://stevemorse.org/phonetics/bmpm.htm

      BMPM would be a fantastic "soundalike" tool to help search for personal names (or just surnames) in a Solr/Lucene index, much better than Levenshtein distance for this use case.
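To illustrate why plain edit distance can be a poor fit for names, here is a minimal, self-contained sketch that computes Levenshtein distance for two pairs of surnames. The class name `EditDistanceDemo`, the method `levenshtein`, and the sample names are hypothetical choices for this illustration and are not taken from the issue. It assumes that "Schwartz" and "Szwarc" are spelling variants of the same surname, while "Levin" and "Lenin" are different names. It does not call Commons Codec, and it says nothing about what BMPM itself would return for these inputs.

```java
// Illustrative sketch only; class and method names are hypothetical.
class EditDistanceDemo {

    // Classic dynamic-programming Levenshtein distance:
    // insertions, deletions, and substitutions each cost 1.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Assumed spelling variants of one surname: large edit distance.
        System.out.println(levenshtein("schwartz", "szwarc")); // prints 4
        // Assumed distinct surnames: small edit distance.
        System.out.println(levenshtein("levin", "lenin"));     // prints 1
    }
}
```

Under these assumptions, a distance threshold loose enough to match the first pair would also match the second, which is the kind of mismatch a phonetic encoder is meant to avoid.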


          People

            Assignee: Unassigned
            Reporter: Brooke Schreier Ganz (asparagirl)
            Votes: 0
            Watchers: 2
