[OPENNLP-59] Bad precision using FMeasure - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: tools-1.5.1-incubating
Fix Version/s: tools-1.5.1-incubating
Component/s: None
Labels: None

Description

I noticed bad precision in the FMeasure results. I think the issue is that the current implementation sums divisions: it computes precision and recall for every sample and then adds the per-sample results to compute the overall result. As a result, the errors from each division are summed and can affect the final result.
I found the problem while implementing the ChunkerEvaluator. To verify the evaluator, I compared the results from OpenNLP with those from the Perl script conlleval, available at http://www.cnts.ua.ac.be/conll2000/chunking/output.html. The results always differed when I processed more than one sentence, because the implementation used FMeasure.updateScores(), which sums divisions.
To solve this and get the same results as the conll script, I basically stopped using the Mean class.
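
To illustrate the difference described above, here is a minimal sketch in Java. It is not the OpenNLP code: the class FMeasureSketch, its method names, and the per-sentence counts are all hypothetical. It contrasts averaging per-sample precision (summing divisions) with accumulating the raw counts and dividing once at the end, which is the behaviour I would expect to match a corpus-level script.

```java
public class FMeasureSketch {

    // Each row holds hypothetical counts for one sentence:
    // {correct chunks, predicted chunks, reference chunks}.

    // Precision computed per sample and then averaged (sums divisions).
    static double meanOfPerSamplePrecision(int[][] counts) {
        double sum = 0.0;
        for (int[] c : counts) {
            sum += c[1] > 0 ? (double) c[0] / c[1] : 0.0;
        }
        return counts.length > 0 ? sum / counts.length : 0.0;
    }

    // Precision computed once from the accumulated totals.
    static double precisionFromTotals(int[][] counts) {
        long correct = 0;
        long predicted = 0;
        for (int[] c : counts) {
            correct += c[0];
            predicted += c[1];
        }
        return predicted > 0 ? (double) correct / predicted : 0.0;
    }

    // Recall computed once from the accumulated totals.
    static double recallFromTotals(int[][] counts) {
        long correct = 0;
        long reference = 0;
        for (int[] c : counts) {
            correct += c[0];
            reference += c[2];
        }
        return reference > 0 ? (double) correct / reference : 0.0;
    }

    // Harmonic mean of precision and recall; 0 when both are 0.
    static double fMeasure(double precision, double recall) {
        double denominator = precision + recall;
        return denominator > 0 ? 2 * precision * recall / denominator : 0.0;
    }

    public static void main(String[] args) {
        // Made-up counts for two sentences, for illustration only.
        int[][] counts = { {1, 1, 1}, {1, 4, 2} };
        double p = precisionFromTotals(counts);
        double r = recallFromTotals(counts);
        System.out.println("mean of per-sample precision: " + meanOfPerSamplePrecision(counts));
        System.out.println("precision from totals:        " + p);
        System.out.println("recall from totals:           " + r);
        System.out.println("F-measure from totals:        " + fMeasure(p, r));
    }
}
```

With these made-up counts, the mean of the per-sample precisions is 0.625, while the precision from the totals is 2/5 = 0.4. The two only agree when there is a single sentence, which is consistent with the results differing as soon as more than one sentence is processed.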

People

Assignee: William Colen

Reporter: William Colen

Votes: 0

Watchers: 0

Dates

Created: 05/Jan/11 16:36

Updated: 19/Jan/11 11:31

Resolved: 19/Jan/11 11:30