  1. cTAKES
  2. CTAKES-158

DateAnnotation bug when two dates directly adjacent

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.0-incubating, 3.1.0
    • None
    • None

    Description

      From an email by Shady AbdelAziz on ctakes-dev@, February 11, 2013:

      While working with DateAnnotation and adding some new state machines to DateFSM.java, I found a minor bug in the start and end indices of DateAnnotation.

      Consider the small example

      "October 2003 November 2010 cTAKES is the best framework".

      The result is supposed to be "October 2003" and "November 2010", but cTAKES detects "October 2003" and "October 2003 November 2010".

      This is because the FSM detects the first date and, since it has no record in the "tokenStartMap", assumes a starting index of "0". It then starts detecting the second date, but there is still no record for it in the map (a value is put in the map only when the machine is in a starting state, in other words when the current token satisfies no state condition), so it again assumes a starting index of "0".

      That is why, for example, detection works fine when there is an intermediate token between the two dates.

      The solution is simply to put a record in the map before resetting the FSM,
      so this line should be added: "tokenStartMap.put(fsm, new Integer);".
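
The behaviour described above can be reproduced with a small, self-contained sketch. This is not the cTAKES source: the `Machine` class, the `findDates` method, and the two-token "month year" pattern are hypothetical stand-ins for DateFSM.java, and only the name `tokenStartMap` comes from the report. The report does not show the argument to `new Integer`; the sketch assumes the index of the current token is what gets recorded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DateSpanSketch {
    enum State { START, MONTH, END }

    static final Set<String> MONTHS = new HashSet<>(Arrays.asList("October", "November"));

    // Toy machine (hypothetical): accepts a month name followed by a four-digit year.
    static final class Machine {
        State state = State.START;

        void input(String token) {
            if (state == State.START && MONTHS.contains(token)) {
                state = State.MONTH;
            } else if (state == State.MONTH && token.matches("\\d{4}")) {
                state = State.END;
            } else {
                state = State.START; // token satisfies no condition
            }
        }

        void reset() {
            state = State.START;
        }
    }

    static List<String> findDates(List<String> tokens, boolean applyFix) {
        List<String> found = new ArrayList<>();
        Machine fsm = new Machine();
        // Maps a machine to the index of the last token seen in its start state.
        Map<Machine, Integer> tokenStartMap = new HashMap<>();

        for (int i = 0; i < tokens.size(); i++) {
            fsm.input(tokens.get(i));
            if (fsm.state == State.START) {
                tokenStartMap.put(fsm, i);
            }
            if (fsm.state == State.END) {
                Integer last = tokenStartMap.get(fsm);
                // With no record in the map, the span is assumed to start at token 0.
                int start = (last == null) ? 0 : last + 1;
                found.add(String.join(" ", tokens.subList(start, i + 1)));
                if (applyFix) {
                    // Assumed form of the proposed fix: record the current index before resetting.
                    tokenStartMap.put(fsm, i);
                }
                fsm.reset();
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("October", "2003", "November", "2010", "cTAKES");
        System.out.println(findDates(tokens, false)); // [October 2003, October 2003 November 2010]
        System.out.println(findDates(tokens, true));  // [October 2003, November 2010]
    }
}
```

Without the extra `put`, the second span starts at token 0 because nothing was recorded between the two matches. Inserting any non-date token between the dates also records an index, which is why that case already worked.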

          People

            Assignee: Unassigned
            Reporter: James Joseph Masanz