Details
- Type: Bug
- Status: Resolved
- Priority: P2
- Resolution: Fixed
Description
The Spark runner currently runs PAsserts via `PAssertStreaming`, which groups the input into a single key and asserts the values on the worker (part of the "Lambda" in Spark lingo).
This is a problem because Spark won't run if there's nothing to process: if for some reason the input is missed, say reading from Kafka with latest offsets or from a simply empty topic, the assertion is skipped and the test never fails (we would like to fail when we expected input but none arrived).
This might change once Spark provides better support for the Beam model in streaming, but until then, our tests should consider this case as well.
I'll add an aggregator and increment it for each assertion; at the end I'll make sure the aggregator is not 0, so that at least one assertion took place. (If for some reason Spark keeps running for a couple more intervals, it might execute the same assertion more than once if the input is repeated.)
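The intended guard can be sketched in plain Java. This is an illustration of the pattern only, not the runner's actual code: the `AtomicLong` stands in for the Beam aggregator, and `assertContents` stands in for the per-key assertion that `PAssertStreaming` runs on the worker.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class AssertionCounterSketch {
    // Stands in for the aggregator: incremented once per executed assertion.
    static final AtomicLong assertionsRun = new AtomicLong();

    // Stands in for the single-key assertion on the worker: compares contents
    // regardless of order, then bumps the counter to record that it ran.
    static void assertContents(List<Integer> actual, List<Integer> expected) {
        if (!actual.containsAll(expected) || !expected.containsAll(actual)) {
            throw new AssertionError("Contents differ: " + actual + " vs " + expected);
        }
        assertionsRun.incrementAndGet();
    }

    public static void main(String[] args) {
        // With input present, the assertion runs and the counter moves.
        assertContents(Arrays.asList(1, 2, 3), Arrays.asList(3, 2, 1));

        // After the (simulated) pipeline finishes, fail unless at least one
        // assertion actually executed - this is the guard against assertions
        // being silently skipped on an empty or missed input.
        if (assertionsRun.get() == 0) {
            throw new AssertionError("No assertions were executed - empty input?");
        }
    }
}
```

The final check is `== 0` rather than `!= 1` because, as noted above, Spark may re-execute the same assertion across intervals when the input repeats, so any positive count is acceptable.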