[LUCENE-5699] Lucene classification score calculation normalize and return lists - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 5.0, 6.0
Component/s: modules/classification
Labels:
- gsoc2014

Lucene Fields: New

Description

Currently the classifiers can return only the single "best matching" class. Anyone who wants to use them for more complex tasks has to modify these classes to get the second and third results as well. If returning a list is possible and does not cost many resources, why not do it? (We already iterate over a list anyway.)

The Bayes classifier returns very small values, and there was a bug with floats underflowing to zero. That was fixed by using logarithms. It would be nice to scale the class scores so that they sum to one; then we could compare the returned scores and relevance of two documents. (Without this, the word count of the test documents affects the result score.)
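One common way to scale log-domain scores so that they sum to one is to subtract the maximum log score before exponentiating and then divide by the total. The sketch below illustrates that idea only; it is not the code from the attached patches, and the class name `ScoreNormalizer` and its `normalize` method are hypothetical. It assumes at least one input score is finite.

```java
// Hypothetical helper, not part of the Lucene classification module.
final class ScoreNormalizer {
  private ScoreNormalizer() {}

  /**
   * Converts per-class log scores into values that sum to one.
   * Assumes at least one score is finite.
   */
  static double[] normalize(double[] logScores) {
    if (logScores.length == 0) {
      return new double[0];
    }
    // Subtracting the maximum keeps Math.exp from underflowing to zero
    // for the best class, even when all log scores are very negative.
    double max = Double.NEGATIVE_INFINITY;
    for (double s : logScores) {
      max = Math.max(max, s);
    }
    double[] normalized = new double[logScores.length];
    double sum = 0.0;
    for (int i = 0; i < logScores.length; i++) {
      normalized[i] = Math.exp(logScores[i] - max);
      sum += normalized[i];
    }
    // Divide by the total so the scores sum to one.
    for (int i = 0; i < normalized.length; i++) {
      normalized[i] /= sum;
    }
    return normalized;
  }
}
```

Because only the differences between log scores matter after this step, two documents of different lengths produce scores on the same 0 to 1 scale.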

In bullet points:

- Return normalized score values and result lists from the Bayes classifier.
- Add the possibility to return a result list from the KNN classifier.
- Make ClassificationResult Comparable for list sorting.
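The points above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual ClassificationResult from the patches: `ScoredClass`, `RankedResults`, and `topK` are hypothetical names, and the ordering (highest score first) is an assumed choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a comparable classification result.
final class ScoredClass implements Comparable<ScoredClass> {
  final String label;
  final double score;

  ScoredClass(String label, double score) {
    this.label = label;
    this.score = score;
  }

  // Assumed ordering: higher scores sort first.
  @Override
  public int compareTo(ScoredClass other) {
    return Double.compare(other.score, this.score);
  }
}

// Hypothetical helper that returns a ranked list instead of a single best class.
final class RankedResults {
  private RankedResults() {}

  static List<ScoredClass> topK(List<ScoredClass> results, int k) {
    // Sort a copy so the caller's list is left untouched.
    List<ScoredClass> sorted = new ArrayList<>(results);
    Collections.sort(sorted);
    int end = Math.max(0, Math.min(k, sorted.size()));
    return new ArrayList<>(sorted.subList(0, end));
  }
}
```

A caller that only needs the best class can still take the first element of the returned list, so the single-result behaviour remains available.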

Attachments

- 06-06-5699.patch, 08/Jun/14 09:18, 19 kB, Gergő Törcsvári
- 0730.patch, 30/Jul/14 07:05, 10 kB, Gergő Törcsvári
- 0803-base.patch, 03/Aug/14 19:42, 10 kB, Gergő Törcsvári
- 0810-base.patch, 10/Aug/14 08:46, 15 kB, Gergő Törcsvári

People

Assignee: Tommaso Teofili
Reporter: Gergő Törcsvári
Votes: 0
Watchers: 6

Dates

Created: 23/May/14 09:00
Updated: 28/Aug/22 14:08
Resolved: 03/Nov/14 08:06