Lucene - Core / LUCENE-5548

Improve flexibility and testability of the classification module


    Details

    • Lucene Fields:
      New

      Description

      The flexibility and capabilities of the Lucene classification module may be improved as follows:

      • make it possible to use the classifiers "online" (or provide online versions of them), so that if the underlying index (reader) is updated, the classifier doesn't need to be trained again to take newly added docs into account
      • optionally pass a different Analyzer together with the text to be classified (or pass a TokenStream directly) to specify custom tokenization/filtering
      • normalize the score calculations of the existing classifiers
      • provide accuracy and speed tests based on publicly available datasets
      • add more Lucene-based classification algorithms

      Specific subtasks for each of the above topics should be created to discuss each of them in depth.
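
As an illustration of the score normalization item above, here is a minimal sketch of one possible approach: rescaling the raw per-class scores so that they sum to 1, which would make scores from different classifiers comparable. The class `ScoreNormalizer` and its `normalize` method are hypothetical names introduced for this example. They are not part of the Lucene classification API, and the issue does not say which normalization scheme should be used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene.
final class ScoreNormalizer {

  private ScoreNormalizer() {}

  /**
   * Rescales non-negative raw class scores so that they sum to 1.
   * Assumption: raw scores are non-negative (e.g. frequencies or similarity scores).
   */
  static Map<String, Double> normalize(Map<String, Double> rawScores) {
    double sum = 0d;
    for (double score : rawScores.values()) {
      if (Double.isNaN(score) || score < 0d) {
        throw new IllegalArgumentException("scores must be non-negative numbers");
      }
      sum += score;
    }
    // LinkedHashMap keeps the classes in their original iteration order.
    Map<String, Double> normalized = new LinkedHashMap<>();
    for (Map.Entry<String, Double> entry : rawScores.entrySet()) {
      // If every class scored 0, fall back to a uniform distribution.
      double value = sum > 0d ? entry.getValue() / sum : 1d / rawScores.size();
      normalized.put(entry.getKey(), value);
    }
    return normalized;
  }

  public static void main(String[] args) {
    Map<String, Double> raw = new LinkedHashMap<>();
    raw.put("sports", 2.0);
    raw.put("politics", 6.0);
    System.out.println(normalize(raw));
  }
}
```

Dividing by the sum is only one option. A softmax over log-probabilities may suit a Naive Bayes style classifier better, which is the kind of question a dedicated subtask could settle.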


            People

            • Assignee:
              Tommaso Teofili (teofili)
              Reporter:
              Tommaso Teofili (teofili)

              Dates

              • Created:
                Updated:

              Time Tracking

              Estimated:
              Not Specified
              Remaining:
              0h
              Logged:
              0.5h
