Lucene - Core / LUCENE-8869

Build kuromoji system dictionary as a separate jar and load it from JapaneseTokenizer at runtime

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: modules/analysis
    • Labels: None
    • Lucene Fields: New

    Description

      This is a sub-task for LUCENE-8816.
      In this issue, I will try to make small but self-contained changes to the kuromoji system dictionary.

      • Make it possible to build a jar that contains (possibly only) the dictionary data resources generated by the build-dict task.
        • A new ant target may be added for this.
      • Make it possible to load an external dictionary when initializing JapaneseTokenizer.
      • Decouple the current system dictionary data (mecab ipadic) from kuromoji itself and use it as the default (this could possibly be done in a separate issue).

      Also, some refactoring of the directory/source tree structure may be needed.
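
To make the second item more concrete, here is a minimal sketch of how dictionary data could be resolved from an external jar or directory at runtime instead of from kuromoji's own classpath. It is not the Lucene API: the class `DictionaryResources`, its methods `loaderFor` and `open`, and the resource name are all hypothetical, and it assumes the dictionary data stays a plain classpath resource that JapaneseTokenizer (or its dictionary classes) could read from an `InputStream`.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

/** Hypothetical helper for illustration only; not part of Lucene. */
final class DictionaryResources {

  private DictionaryResources() {}

  /**
   * Returns a class loader that can see the external dictionary.
   * If no external location is given, falls back to the loader that
   * loaded this class (i.e. the bundled default dictionary).
   */
  static ClassLoader loaderFor(Path dictionaryJarOrDir) throws IOException {
    ClassLoader parent = DictionaryResources.class.getClassLoader();
    if (dictionaryJarOrDir == null) {
      return parent;
    }
    URL url = dictionaryJarOrDir.toUri().toURL();
    return new URLClassLoader(new URL[] {url}, parent);
  }

  /** Opens a dictionary resource, failing loudly if it is missing. */
  static InputStream open(ClassLoader loader, String resourceName) throws IOException {
    InputStream in = loader.getResourceAsStream(resourceName);
    if (in == null) {
      throw new FileNotFoundException("Dictionary resource not found: " + resourceName);
    }
    return in;
  }
}
```

With something like this, the tokenizer's dictionary classes would only need an extra entry point that accepts a `ClassLoader` or `InputStream`, and the bundled mecab ipadic data would simply be the case where no external location is given. Note that `URLClassLoader` consults its parent first, so a resource with the same name on the default classpath would take precedence over the external one.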

            People

              Assignee: Unassigned
              Reporter: Tomoko Uchida (tomoko)
              Votes: 1
              Watchers: 1
