  1. Commons Codec
  2. CODEC-187

Beider Morse Phonetic Matching producing incorrect tokens

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.9
    • Fix Version/s: 1.10
    • None

    Description

      I believe the Beider Morse Phonetic Matching (BMPM) algorithm was added in Commons Codec 1.6.

      The BMPM algorithm is an EVOLVING algorithm that is currently at version 3.02, though it had been static since version 3.01, dated 19 Dec 2011 (it was first available as open source as version 1.00 on 6 May 2009).

      I can see nothing in the Commons Codec docs to say which version of BMPM was implemented, so I am not sure whether the problem with the algorithm as coded in the Codec is simply that it is an old version or whether there are more basic problems with the implementation.

      How do I determine the version of the algorithm that was implemented in the Commons Codec?

      How do we ensure that the algorithm is updated if/when the BMPM algorithm changes?

      How do we ensure that the algorithm as coded in the Commons Codec is accurate and working as expected?
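
      One way to approach the last question is a regression check that compares the tokens an encoder produces against tokens taken from the reference BMPM implementation. The following is only a minimal sketch under stated assumptions: the class name `PhoneticRegressionCheck`, the stand-in encoder, and the sample token strings are all hypothetical, and it assumes the encoder returns its alternatives as a single pipe-separated string (as `BeiderMorseEncoder.encode` appears to). It is not the actual Commons Codec test suite.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

public class PhoneticRegressionCheck {

    // Split a pipe-separated token string into a sorted set,
    // so that token order and blank entries do not affect the comparison.
    public static Set<String> tokens(String encoded) {
        Set<String> result = new TreeSet<>();
        for (String t : encoded.split("\\|")) {
            String trimmed = t.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    // Tokens listed by the reference but not produced by the implementation.
    public static Set<String> missing(String reference, String actual) {
        Set<String> diff = tokens(reference);
        diff.removeAll(tokens(actual));
        return diff;
    }

    // Tokens produced by the implementation but not listed by the reference.
    public static Set<String> unexpected(String reference, String actual) {
        Set<String> diff = tokens(actual);
        diff.removeAll(tokens(reference));
        return diff;
    }

    public static void main(String[] args) {
        // Stand-in encoder for illustration only; this is NOT the BMPM algorithm.
        // In a real check this would delegate to the encoder under test.
        Function<String, String> encoder =
                name -> name.toLowerCase() + "|" + name.toLowerCase().replace('c', 'k');

        // Hypothetical reference output, not real BMPM data.
        String reference = "cat|kat|kot";
        String actual = encoder.apply("Cat");

        System.out.println("missing:    " + missing(reference, actual));
        System.out.println("unexpected: " + unexpected(reference, actual));
    }
}
```

      Comparing sets rather than raw strings means a pure reordering of tokens is not reported as a failure; only tokens that are genuinely missing or unexpected show up.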

      Attachments

        1. CODEC-187.patch
          9 kB
          Thomas Neidhart
        2. CODEC-187_ashkenazi_approx_any.patch
          9 kB
          Thomas Neidhart
        3. CODEC-187_ashkenazi_approx_any_v2.patch
          13 kB
          Thomas Neidhart
        4. CODEC_187_sync_with_v3.3.diff
          47 kB
          Thomas Neidhart

            People

              Assignee: Unassigned
              Reporter: michael tobias (mikkitobi)
              Votes: 0
              Watchers: 6