Lucene - Core / LUCENE-5548

Improve flexibility and testability of the classification module

Details

    • New

    Description

      The flexibility and capabilities of Lucene's classification module could be improved in the following ways:

      • make it possible to use the classifiers "online" (or provide online versions of them), so that when the underlying index (reader) is updated, a classifier does not need to be trained again to take newly added docs into account
      • optionally pass a different Analyzer (or a TokenStream directly) together with the text to be classified, to specify custom tokenization/filtering
      • normalize the score calculations of the existing classifiers
      • provide accuracy and speed tests based on publicly available datasets
      • add more Lucene-based classification algorithms

      Specific subtasks for each of the above topics should be created to discuss each of them in depth.
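
      As an illustration of the score normalization item above, here is a minimal sketch in plain Java. It does not use the Lucene API and is not the approach taken in the module: the class name ScoreNormalizer and the choice to rescale raw per-class scores so that they sum to 1 are assumptions made only for this example. The point is that scores normalized this way are comparable across classifiers, whatever scale each one computes internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: rescales raw per-class scores
// so that they sum to 1.0 and can be compared across classifiers.
final class ScoreNormalizer {

    private ScoreNormalizer() {}

    /** Returns a new map with the same keys (in the same order) whose values sum to 1.0. */
    static Map<String, Double> normalize(Map<String, Double> rawScores) {
        double sum = 0.0;
        for (double score : rawScores.values()) {
            // This sketch assumes non-negative raw scores (e.g. counts or similarities).
            if (Double.isNaN(score) || score < 0) {
                throw new IllegalArgumentException("scores must be non-negative: " + score);
            }
            sum += score;
        }
        Map<String, Double> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : rawScores.entrySet()) {
            // If every raw score is zero, fall back to a uniform distribution.
            double value = sum > 0 ? entry.getValue() / sum : 1.0 / rawScores.size();
            normalized.put(entry.getKey(), value);
        }
        return normalized;
    }

    public static void main(String[] args) {
        Map<String, Double> raw = new LinkedHashMap<>();
        raw.put("sports", 2.0);
        raw.put("politics", 6.0);
        System.out.println(normalize(raw));
    }
}
```

      A classifier that works with log-probabilities would need a different scheme (for example exponentiating before rescaling); that case is left out of this sketch.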

          People

            teofili Tommaso Teofili
            Votes: 0
            Watchers: 3

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 0.5h