public class TestRussianAnalyzer extends TestCase {

  Reader reader = new StringReader("text 1000");

  public void testStemmer() {
    testAnalyzer(new RussianAnalyzer());
  }

  public void testFixedRussianAnalyzer() {
    testAnalyzer(new RussianAnalyzer(getRussianCharSet()));
  }

  private void testAnalyzer(RussianAnalyzer analyzer) {
    try {
      TokenStream stream = analyzer.tokenStream("text", reader);
      assertEquals("text", stream.next().termText());
      assertNotNull(stream.next());
    } catch (IOException e) {
      fail(e.getMessage());
    }
  }

  private char[] getRussianCharSet() {
    int length = RussianCharsets.UnicodeRussian.length;
    final char[] russianChars = new char[length + 10];
    System.arraycopy(RussianCharsets.UnicodeRussian, 0, russianChars, 0, length);
    russianChars[length++] = '0';
    russianChars[length++] = '1';
    russianChars[length++] = '2';
    russianChars[length++] = '3';
    russianChars[length++] = '4';
    russianChars[length++] = '5';
    russianChars[length++] = '6';
    russianChars[length++] = '7';
    russianChars[length++] = '8';
    russianChars[length] = '9';
    return russianChars;
  }
}
I raised this on the dev list a few months ago and didn't get much response.
I think I might even be responsible for the code above. It was meant more as a hack to get a customer up and running.
Cheers,
Nick