cTAKES / CTAKES-231

missing NEs because of inconsistent chunking for parallel sentence constructions


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Workaround
    • Affects Version/s: 3.0-incubating
    • Fix Version/s: 4.0.0
    • Component/s: ctakes-chunker
    • Labels: None

    Description

      The phrase "cancer of colon, lung and liver" results in an annotation for liver cancer.

      The phrase "cancer of colon, liver and lung." does not result in an annotation for liver cancer or for lung cancer.

      Thanks Dennis Lee Hon Kit for reporting this.

      Details:

      Reproduced by running 3.0.0-incubating with the separately downloadable UMLS resources, using AggregatePlaintextUMLSProcessor.xml. The run results in these chunk annotations:

      [0] org.apache.ctakes.typesystem.type.syntax.NP
      [1] org.apache.ctakes.typesystem.type.syntax.PP
      [2] org.apache.ctakes.typesystem.type.syntax.NP
      [3] org.apache.ctakes.typesystem.type.syntax.NP
      [4] org.apache.ctakes.typesystem.type.syntax.PP
      [5] org.apache.ctakes.typesystem.type.syntax.NP
      [6] org.apache.ctakes.typesystem.type.syntax.O
      [7] org.apache.ctakes.typesystem.type.syntax.O
      [8] org.apache.ctakes.typesystem.type.syntax.NP
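
      To illustrate why the chunk boundaries matter for entity lookup, here is a minimal toy sketch in Python. It is not cTAKES code: the dictionary, the window spans, and the function find_terms are all hypothetical, and it only assumes that lookup is restricted to windows derived from chunks and that word order within a window is ignored.

```python
from itertools import permutations

# Hypothetical toy dictionary: each term is a tuple of tokens.
DICTIONARY = {("liver", "cancer"), ("lung", "cancer"), ("colon", "cancer")}


def find_terms(tokens, windows, dictionary=DICTIONARY):
    """Return dictionary terms whose tokens all fall inside one lookup window.

    tokens  -- list of lower-cased tokens
    windows -- list of (start, end) token index pairs, end exclusive
    """
    found = set()
    for start, end in windows:
        window_tokens = tokens[start:end]
        # Try every ordered pair of tokens in the window, so that
        # "cancer ... liver" can still match the entry "liver cancer".
        for pair in permutations(window_tokens, 2):
            if pair in dictionary:
                found.add(" ".join(pair))
    return found


tokens = ["cancer", "of", "colon", ",", "liver", "and", "lung"]

# Assumed chunking A: a single window covers the whole phrase.
one_window = [(0, 7)]
# Assumed chunking B: the later conjuncts land in windows without "cancer".
split_windows = [(0, 3), (4, 5), (6, 7)]

print(sorted(find_terms(tokens, one_window)))
print(sorted(find_terms(tokens, split_windows)))
```

      Under these assumptions, a conjunct that ends up in a window without the head word "cancer" can no longer be combined with it, which would be consistent with the missing annotations described above.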

      Attachments

        1. liver.cancer.chunking.issue.xmi.xml
          24 kB
          James Joseph Masanz

          People

            Assignee: Sean Finan (seanfinan)
            Reporter: James Joseph Masanz (james-masanz)
            Votes: 0
            Watchers: 3
