Details
Improvement
Status: Closed
Critical
Resolution: Workaround
None
None
Important
Description
As soon as I add the negation AE at the end of my pipeline:
aggregateBuilder.add( PolarityCleartkAnalysisEngine.createAnnotatorDescription() );
the pipeline becomes 50-100 times slower. This likely has to do with the following line:
List<Sentence> sents = new ArrayList<>(JCasUtil.selectCovering(jCas, Sentence.class, entityOrEventMention.getBegin(), entityOrEventMention.getEnd()));
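in AssertionCleartkAnalysisEngine. I am running the pipeline on large files (i.e., files containing a large number of sentences). The slowdown appears to come from this code scanning all sentences in the document for each identified annotation, so the total cost grows roughly with the number of annotations times the number of sentences.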
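To illustrate the cost difference described above, here is a minimal, library-free sketch. It is not the cTAKES or uimaFIT code: `Span`, `coveringByScan`, and `coveringByBinarySearch` are hypothetical names, and the sketch assumes sentences are non-overlapping and already sorted by begin offset. The first method mimics a per-mention linear scan; the second finds the covering sentence with a binary search over a list that is built once per document. uimaFIT also provides `JCasUtil.indexCovering`, which builds a covering map once per CAS; whether it is a suitable replacement here has not been verified.

```java
import java.util.List;

final class CoveringSentenceIndex {

    /** Hypothetical stand-in for a UIMA annotation: a [begin, end] character span. */
    public static final class Span {
        public final int begin;
        public final int end;

        public Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Per-mention linear scan: O(number of sentences) per call, so
     * O(mentions * sentences) over a whole document.
     */
    public static Span coveringByScan(List<Span> sentences, Span mention) {
        for (Span s : sentences) {
            if (s.begin <= mention.begin && mention.end <= s.end) {
                return s;
            }
        }
        return null; // no single sentence covers the mention
    }

    /**
     * Binary search over sentences sorted by begin offset: O(log sentences) per call.
     * Assumption: sentences do not overlap, so the only candidate is the last
     * sentence whose begin offset is <= the mention's begin offset.
     */
    public static Span coveringByBinarySearch(List<Span> sortedSentences, Span mention) {
        int lo = 0;
        int hi = sortedSentences.size() - 1;
        Span candidate = null;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            Span s = sortedSentences.get(mid);
            if (s.begin <= mention.begin) {
                candidate = s;   // starts at or before the mention; try further right
                lo = mid + 1;
            } else {
                hi = mid - 1;    // starts after the mention; look left
            }
        }
        return (candidate != null && mention.end <= candidate.end) ? candidate : null;
    }
}
```

In a real pipeline the sorted sentence list (or a covering map) would be built once per document at the start of `process`, and each mention would then be looked up against it instead of triggering a fresh scan.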
The full pipeline is here:
https://github.com/dmitriydligach/ctakes-misc/blob/master/src/main/java/org/apache/ctakes/pipelines/UmlsLookupPipeline.java