Issue Details

Key: CODEC-17
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Unassigned
Reporter: Tim O'Brien
Votes: 0
Watchers: 0

Commons Codec

[codec] Metaphone B not handling ending MB correctly

Created: 19/Apr/04 03:48 AM   Updated: 27/Oct/07 06:49 AM
Component/s: None
Affects Version/s: Nightly Builds
Fix Version/s: 1.3

Time Tracking:
Not Specified

Environment:
Operating System: other
Platform: Other

Bugzilla Id: 28457


Description
Error in the case for 'B': if a word ends in "MB" (e.g. "COMB"), Metaphone
should not add B to the code.

Comments
Tim O'Brien added a comment - 19/Apr/04 09:39 AM
This issue has been addressed. For the record, here is an excerpt from one of
my emails to commons-dev:

> I uncovered a potential bug in Metaphone. The code in question deals with
> the encoding of 'B':
>
> // START CODE from Metaphone
>
> case 'B' :
>     if ((n > 0) && !(n + 1 == wdsz) &&
>         (local.charAt(n - 1) == 'M')) { // not MB at end of word
>         code.append(symb);
>     } else {
>         code.append(symb);
>     }
>     mtsz++;
>     break;
>
> // END CODE
>
> My understanding is that we should not encode a 'B' if a word ends in
> "MB" (following
> http://aspell.sourceforge.net/metaphone/metaphone-kuhn.txt). So the
> Metaphone of "COMB" is "KM" not "KMB", and the Metaphone of "TOMB" is
> "TM" not "TMB". Note that both branches of the code above append the
> symbol, so the 'B' is always encoded. I "refactored" this code a bit and
> came up with the following:
>
> case 'B' :
>     if ( isPreviousChar(local, n, 'M') &&
>          isLastChar(wdsz, n) ) {
>         // B is silent if word ends in MB
>         break;
>     } else {
>         code.append(symb);
>     }
>     break;
>
> Also, this code was (outright) copied from a C++ program; there was no
> need to keep track of the length of our StringBuffer in a variable
> named "mtsz". That's gone, and the only reason this cleanup was possible
> was great code coverage.
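
For readers who want to try the silent-B rule outside the encoder, here is a minimal standalone sketch in Java. The helper names isPreviousChar and isLastChar and their argument order come from the excerpt above, but their bodies, the CharSequence parameter type, the encodesB method, and the SilentBSketch class are assumptions made for illustration. They are not taken from the Commons Codec source.

```java
import java.util.Locale;

public class SilentBSketch {

    // Assumed body: true if the character just before index n is c.
    static boolean isPreviousChar(CharSequence word, int n, char c) {
        return n > 0 && n < word.length() && word.charAt(n - 1) == c;
    }

    // Assumed body: true if index n is the last position in a word of length wdsz.
    static boolean isLastChar(int wdsz, int n) {
        return n + 1 == wdsz;
    }

    // Hypothetical wrapper: returns true when the 'B' at index n should be
    // encoded, i.e. it is NOT the final letter of a word ending in "MB".
    static boolean encodesB(String word, int n) {
        String local = word.toUpperCase(Locale.ENGLISH);
        return !(isPreviousChar(local, n, 'M') && isLastChar(local.length(), n));
    }

    public static void main(String[] args) {
        System.out.println(encodesB("COMB", 3));   // false: B is silent
        System.out.println(encodesB("TOMB", 3));   // false: B is silent
        System.out.println(encodesB("NUMBER", 3)); // true: MB is not at the end
        System.out.println(encodesB("BAT", 0));    // true: no preceding M
    }
}
```

The check only suppresses the 'B' when both conditions hold, so a word such as "NUMBER", where "MB" appears mid-word, still has its 'B' encoded.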