Description
This is a follow-up issue to LUCENE-8784.
The KoreanNumberFilter is a TokenFilter that normalizes Korean numbers to regular Arabic decimal numbers in half-width characters.
Its logic is similar to that of JapaneseNumberFilter.
It should cover the following test cases:
1) Korean Word to Number
십만이천오백 => 102500
2) Digit-by-digit (single-character) conversion
일영영영 => 1000
3) Decimal Point Calculation
3.2천 => 3200
4) Comma as a thousands separator
4,647.0010 => 4647.001
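The four cases above can be sketched in plain Java as follows. This is only an illustration of the arithmetic, not the filter's implementation: the class name KoreanNumberSketch and the method normalize are hypothetical, it works on a whole string rather than on a Lucene token stream, and it assumes the digit characters 영 일 이 삼 사 오 육 칠 팔 구, the small units 십 백 천, and the large units 만 억 조 경. The characters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.math.BigDecimal;

public class KoreanNumberSketch {
    // Index in DIGITS is the digit value 0..9 (yeong, il, i, sam, sa, o, yuk, chil, pal, gu).
    private static final String DIGITS =
        "\uC601\uC77C\uC774\uC0BC\uC0AC\uC624\uC721\uCE60\uD314\uAD6C";
    // sip (10), baek (100), cheon (1000): index + 1 is the power of ten.
    private static final String SMALL_UNITS = "\uC2ED\uBC31\uCC9C";
    // man (10^4), eok (10^8), jo (10^12), gyeong (10^16): (index + 1) * 4 is the power of ten.
    private static final String LARGE_UNITS = "\uB9CC\uC5B5\uC870\uACBD";

    /** Converts a Korean or mixed numeral string to a plain Arabic decimal string. */
    public static String normalize(String text) {
        BigDecimal total = BigDecimal.ZERO;          // completed large-unit groups
        BigDecimal section = BigDecimal.ZERO;        // running value below the next large unit
        StringBuilder pending = new StringBuilder(); // digits not yet scaled by a unit

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int digit = DIGITS.indexOf(c);
            int small = SMALL_UNITS.indexOf(c);
            int large = LARGE_UNITS.indexOf(c);
            if (c == ',') {
                continue; // thousands separator carries no value
            } else if ((c >= '0' && c <= '9') || c == '.') {
                pending.append(c);
            } else if (digit >= 0) {
                pending.append((char) ('0' + digit)); // Korean digit to Arabic digit
            } else if (small >= 0) {
                // A unit with no preceding digits counts as one unit (sip alone is 10).
                BigDecimal factor = take(pending, BigDecimal.ONE);
                section = section.add(factor.scaleByPowerOfTen(small + 1));
            } else if (large >= 0) {
                boolean bare = pending.length() == 0 && section.signum() == 0;
                section = bare ? BigDecimal.ONE : section.add(take(pending, BigDecimal.ZERO));
                total = total.add(section.scaleByPowerOfTen((large + 1) * 4));
                section = BigDecimal.ZERO;
            } else {
                throw new IllegalArgumentException("Not part of a number: " + c);
            }
        }
        section = section.add(take(pending, BigDecimal.ZERO));
        // Drop trailing zeros and avoid scientific notation in the result.
        return total.add(section).stripTrailingZeros().toPlainString();
    }

    /** Returns the buffered digits as a number and clears the buffer, or ifEmpty if none. */
    private static BigDecimal take(StringBuilder pending, BigDecimal ifEmpty) {
        if (pending.length() == 0) {
            return ifEmpty;
        }
        BigDecimal value = new BigDecimal(pending.toString());
        pending.setLength(0);
        return value;
    }
}
```

With this sketch, normalize("십만이천오백") yields 102500, normalize("일영영영") yields 1000, normalize("3.2천") yields 3200, and normalize("4,647.0010") yields 4647.001. BigDecimal is used here so that decimal inputs such as 3.2 scale exactly and trailing zeros can be stripped without floating-point error.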
Attachments
Issue Links
- is related to LUCENE-8784 Nori(Korean) tokenizer removes the decimal point. (Closed)