Details
Bug
Status: Open
Minor
Resolution: Unresolved
3.0-incubating, 3.1.0
None
None
Description
From an email by Shady AbdelAziz to ctakes-dev@, February 11, 2013:
While working with DateAnnotation and adding some new state machines to DateFSM.java, I found a minor bug in the start and end indices of DateAnnotation.
Consider this small example:
"October 2003 November 2010 cTAKES is the best framework".
The result is supposed to be "October 2003" and "November 2010", but cTAKES detects "October 2003" and "October 2003 November 2010".
This happens because the FSM detects the first date, finds no record for it in "tokenStartMap", and so assumes a starting index of "0". It then detects the second date, but there is still no record for it in the map (a value is put in the map only when the machine is in a starting state, in other words when a token does not satisfy any state's condition), so it again assumes a starting index of "0".
That is why, for example, if there is an intermediate token between the two dates, detection works fine.
The solution is simply to put a record in the map before resetting the FSM.
So this line should be added: "tokenStartMap.put(fsm, new Integer);".
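The behavior described above can be reproduced with a minimal, self-contained sketch. This is not the actual DateFSM.java code: the class DateFsmSketch, the toy two-step machine (month followed by a four-digit year), and the applyFix flag are all hypothetical and exist only for illustration. The quoted fix line does not show the constructor argument, so the sketch assumes the value to record is the index of the token that ended the match, with the next date starting one token later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the real cTAKES DateFSM.
class DateFsmSketch {
    enum State { START, MONTH_SEEN, END }

    static final Set<String> MONTHS = Set.of("october", "november");

    // Toy machine: START --month--> MONTH_SEEN --4-digit year--> END.
    // Any token that satisfies no transition sends the machine back to START.
    static final class Machine {
        State state = State.START;

        void input(String token) {
            if (state == State.START && MONTHS.contains(token.toLowerCase())) {
                state = State.MONTH_SEEN;
            } else if (state == State.MONTH_SEEN && token.matches("\\d{4}")) {
                state = State.END;
            } else {
                state = State.START;
            }
        }

        void reset() { state = State.START; }
    }

    static List<String> detect(String text, boolean applyFix) {
        String[] tokens = text.split("\\s+");
        Machine fsm = new Machine();
        Map<Machine, Integer> tokenStartMap = new HashMap<>();
        List<String> dates = new ArrayList<>();

        for (int i = 0; i < tokens.length; i++) {
            fsm.input(tokens[i]);
            // A record is written only while the machine sits in its start state.
            if (fsm.state == State.START) {
                tokenStartMap.put(fsm, i);
            }
            if (fsm.state == State.END) {
                Integer last = tokenStartMap.get(fsm);
                // No record means the start index falls back to 0.
                int start = (last == null) ? 0 : last + 1;
                dates.add(String.join(" ", Arrays.copyOfRange(tokens, start, i + 1)));
                if (applyFix) {
                    // Assumed form of the proposed fix: record the index
                    // of the last matched token before resetting.
                    tokenStartMap.put(fsm, i);
                }
                fsm.reset();
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        String text = "October 2003 November 2010 cTAKES is the best framework";
        System.out.println(detect(text, false));
        System.out.println(detect(text, true));
    }
}
```

In this sketch, calling detect with applyFix set to false returns the second date with the first one prepended, as in the report, while applyFix set to true yields the two dates separately. Inserting a word between the two dates also avoids the problem without the fix, because that word returns the machine to its start state and writes a record.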