Commons Codec / CODEC-233

Soundex should support more algorithm variants


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.11

    Description

      The existing Soundex class was designed around the American Soundex algorithm.

      Whilst it offers some flexibility in the mapping of letters to Soundex numbers, the list of 'silent' letters (H and W) is built into the code. There is no provision for changing the set of silent (ignored) letters.

      There is also no way to change the designation of H and W from silent letters to consonant separators (i.e. code 0), because H and W are already encoded as 0 in the mapping exposed by the public API and are nevertheless treated as silent.

      To fix this, the mapping can be enhanced to support an extra code for 'silent' letters.

      A mapping that includes such a code had no defined behaviour previously, so it can be treated differently without breaking existing usage: there is no need to assume that H and W are silent.

      This allows for the definition of alternative silent letters.

      It can also be used to map H and W as code '0', as long as the mapping contains at least one 'silent' code.

      If the algorithm variant has no actual silent letters, the 'silent' code can be appended to the end of the mapping. This does not affect processing, as only the letters A-Z are passed to the method and the extra position is never looked up.
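
      The proposal above can be sketched roughly as follows. This is an illustrative, self-contained sketch, not the Commons Codec implementation: the class name SilentCodeSoundexSketch, the method encode and the choice of '-' as the 'silent' code are assumptions made for the example. It shows a mapping without the marker keeping the old behaviour (H and W silent), and a mapping with the marker appended as a 27th character turning H and W into code '0' separators.

```java
import java.util.Locale;

public class SilentCodeSoundexSketch {
    // Assumed marker for 'silent' letters; the issue text does not name the character.
    static final char SILENT = '-';

    // Existing-style mapping for A..Z: H and W share code '0' with the vowels.
    static final String LEGACY = "01230120022455012623010202";

    // Same 26 codes plus a trailing marker. No letter is silent here, so H and W
    // act as code '0' separators; the 27th character is never looked up.
    static final String HW_SEPARATOR = LEGACY + SILENT;

    static String encode(String name, String mapping) {
        // H and W are special-cased as silent only when the mapping has no marker.
        boolean legacyHW = mapping.indexOf(SILENT) < 0;

        // Keep only the letters A-Z, upper-cased.
        StringBuilder letters = new StringBuilder();
        for (char c : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                letters.append(c);
            }
        }
        if (letters.length() == 0) {
            return "";
        }

        char[] out = {'0', '0', '0', '0'};
        int count = 0;
        out[count++] = letters.charAt(0);
        char last = mapping.charAt(letters.charAt(0) - 'A');

        for (int i = 1; i < letters.length() && count < out.length; i++) {
            char c = letters.charAt(i);
            if (legacyHW && (c == 'H' || c == 'W')) {
                continue; // old behaviour: ignored, previous code is kept
            }
            char code = mapping.charAt(c - 'A');
            if (code == SILENT) {
                continue; // new behaviour: letters with the silent code are ignored
            }
            if (code != '0' && code != last) {
                out[count++] = code; // store neither separators nor repeats
            }
            last = code; // a '0' here separates two equal consonant codes
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(encode("Ashcraft", LEGACY));       // prints A261
        System.out.println(encode("Ashcraft", HW_SEPARATOR)); // prints A226
    }
}
```

      In this sketch the name "Ashcraft" shows the difference: with H silent, S and C collapse into a single 2, whereas with H as a separator both are emitted.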

      An alternative would be to introduce yet another code as an alias for '0', and only treat HW as silent if they have code '0'.

          People

            Assignee: Unassigned
            Reporter: sebb (Sebb)