Details
Bug
Status: Closed
Major
Resolution: Fixed
2.2.2
None
Description
The client should be told which of its input CASes might have (indirectly) generated a failing child CAS. Also, is there any value in sending more than one exception if many children fail? If the aggregate is not a CM, the client should just be told that the input CAS failed.
Here is part of a recent email discussion on this problem:
I think I have a somewhat clearer picture of how we might handle errors on child CASes.
First consider Primitive & Aggregate CMs, and then a non-CM aggregate that contains a CM.
I can see 3 different ways an application may wish to handle errors on child CASes:
Primitive CM
1. Stop generating children/descendants of the input CAS and return an exception on the input CAS
2. Generate an "incomplete" CAS – perhaps marked as "damaged" (useful when the total count must be preserved and a place-holder provided)
3. Ignore the error, generate no CAS and carry on to generate the next CAS (if any)
Aggregate CM
1. Stop generating any more children/descendants from the input CAS and return the exception on the input CAS
2. Allow the CAS to continue in the flow
3. Quietly drop the CAS, do not return it and do not generate an exception
Simple Aggregate with internal CM
1. Stop generating any more children/descendants from the input CAS and return the exception on the input CAS
2. Allow the CAS to continue in the flow (it will be dropped at the end of the flow)
3. Quietly drop the CAS as if it reached the end of the flow, and do not generate an exception
Currently our aggregate error-handling supports #2, while #3 doesn't depend on the framework. I have added aggregate support for #3 to the AdvancedFixedFlowController in the UIMA-AS test suite (as part of Jira 1353) in the form of a new AllowDropOnFailure option which specifies the delegates for which a failing CAS can be dropped, i.e. skip to the end of the flow with the forceCasToBeDropped flag set. (I used it to test the thresholdWindow error handling to verify that an intermittently failing delegate is disabled when N of the last M CASes fail.)
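The "N of the last M CASes fail" rule mentioned above can be illustrated with a small sliding-window counter. This is only a sketch of the idea, not the UIMA-AS thresholdWindow implementation: the class name FailureWindow and its record method are hypothetical, and it assumes each CAS outcome is reported exactly once.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sliding-window failure tracker; not the UIMA-AS code. */
final class FailureWindow {
    private final int threshold;   // N: failures needed to trip the window
    private final int windowSize;  // M: number of most recent CASes considered
    private final Deque<Boolean> outcomes = new ArrayDeque<>();
    private int failures = 0;

    FailureWindow(int threshold, int windowSize) {
        if (threshold < 1 || windowSize < threshold) {
            throw new IllegalArgumentException("need 1 <= threshold <= windowSize");
        }
        this.threshold = threshold;
        this.windowSize = windowSize;
    }

    /** Records one CAS outcome; returns true when at least N of the last M failed. */
    boolean record(boolean failed) {
        outcomes.addLast(failed);
        if (failed) {
            failures++;
        }
        // Evict the oldest outcome once more than M are held.
        if (outcomes.size() > windowSize && outcomes.removeFirst()) {
            failures--;
        }
        return failures >= threshold;
    }
}
```

A caller would invoke record(true) for each failing CAS and record(false) for each success, and disable the delegate the first time the method returns true.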
But I don't think our docs indicate what should happen in #1, and the current implementation handles it differently: the exception is associated with the child CAS without any reference to the input CAS, and the CM continues to generate children, so the client can get many exceptions that refer to unknown CASes. The getParentCasReferenceId() method in the UimaASProcessStatus (which I could not find in the JavaDocs) can be used to associate a child CAS with the input CAS that generated it, but it is always null when an exception is returned.
Consider the information available to the entityProcessComplete callback when an input CAS successfully generates 2 children:
returnedCAS  getCasReferenceId()  getParentCasReferenceId()  isException()
Child1       ID-of-Child1         ID-of-Parent               false
Child2       ID-of-Child2         ID-of-Parent               false
Parent       ID-of-Parent         null                       false
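As a rough illustration of how a client could use these three values, the sketch below groups returned child CAS IDs under the input CAS that generated them. StatusSketch is a hypothetical stand-in that carries only the values shown in the table; it is not the real UimaASProcessStatus interface, and ChildTracker is likewise invented for this example. Statuses that carry an exception are simply skipped here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in holding the three status values from the table. */
final class StatusSketch {
    final String casReferenceId;
    final String parentCasReferenceId; // null for an input (parent) CAS
    final boolean exception;

    StatusSketch(String casReferenceId, String parentCasReferenceId, boolean exception) {
        this.casReferenceId = casReferenceId;
        this.parentCasReferenceId = parentCasReferenceId;
        this.exception = exception;
    }
}

/** Groups successfully returned children by the input CAS that generated them. */
final class ChildTracker {
    private final Map<String, List<String>> childrenByParent = new LinkedHashMap<>();

    /** Intended to be called once per callback with that callback's status values. */
    void onEntityProcessComplete(StatusSketch status) {
        if (status.exception) {
            return; // failed CASes are not tracked in this sketch
        }
        if (status.parentCasReferenceId != null) {
            childrenByParent
                .computeIfAbsent(status.parentCasReferenceId, k -> new ArrayList<>())
                .add(status.casReferenceId);
        }
    }

    /** Returns the child IDs seen so far for the given input CAS (possibly empty). */
    List<String> childrenOf(String parentId) {
        return childrenByParent.getOrDefault(parentId, Collections.<String>emptyList());
    }
}
```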
If the 2nd child causes an exception then the client might see (Option A):
returnedCAS  getCasReferenceId()  getParentCasReferenceId()  isException()
Child1       ID-of-Child1         ID-of-Parent               false
null         ID-of-Parent         null                       true
Or we could put the failing child's ID in the status (Option B):
returnedCAS  getCasReferenceId()  getParentCasReferenceId()  isException()
Child1       ID-of-Child1         ID-of-Parent               false
null         ID-of-Child2         ID-of-Parent               true
Note that in an Aggregate CM the failing CAS may not have been generated directly by the parent, but by any one of its descendants.
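To make the descendant case concrete, here is a minimal sketch of resolving any failing CAS back to the original input CAS by walking recorded parent links. CasLineage and its methods are hypothetical names for illustration only, and the sketch assumes the service records each child-to-parent link as children are generated and that the links never form a cycle.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical child-to-parent bookkeeping; not part of UIMA-AS. */
final class CasLineage {
    private final Map<String, String> parentOf = new HashMap<>();

    /** Records that childId was generated while processing parentId. */
    void recordChild(String childId, String parentId) {
        parentOf.put(childId, parentId);
    }

    /** Follows parent links until reaching a CAS with no recorded parent. */
    String inputCasOf(String casId) {
        String current = casId;
        String parent;
        while ((parent = parentOf.get(current)) != null) {
            current = parent;
        }
        return current;
    }
}
```

With such a lookup, an exception raised on a grandchild CAS could still be reported against the client's input CAS, as Option A requires.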
I think option A is cleaner and easier to document ... "exception always on input CAS". If the ID of the failing child is useful, we could wrap the exception in another one that says something like "Exception inherited from generated CAS xyz".
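The wrapping idea could look roughly like the following. InheritedCasException is a hypothetical class name, not an existing UIMA-AS exception type; the sketch only shows that the original failure can be kept as the cause while the message names the failing child.

```java
/** Hypothetical wrapper that reports a child CAS failure on the input CAS. */
final class InheritedCasException extends Exception {
    private static final long serialVersionUID = 1L;

    private final String failingCasId;

    InheritedCasException(String failingCasId, Throwable cause) {
        // The original exception is preserved as the cause for diagnosis.
        super("Exception inherited from generated CAS " + failingCasId, cause);
        this.failingCasId = failingCasId;
    }

    /** ID of the generated CAS on which the original exception occurred. */
    String getFailingCasId() {
        return failingCasId;
    }
}
```

A client that only cares about the input CAS can ignore getFailingCasId(), while one that needs more detail can inspect it along with getCause().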
Any other options we should consider?
I'll put this in a Jira as that may be the better place to discuss it.