Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-735

PAssertStreaming should make sure the assertion happened.

Details

    • Bug
    • Status: Resolved
    • P2
    • Resolution: Fixed
    • None
    • 0.3.0-incubating
    • runner-spark
    • None

    Description

      The Spark runner currently runs PAsserts via `PAssertStreaming` which groups into a single key and asserts the values on the worker (part of the "Lambda" in the Spark lingo).
      This could be a problem since Spark won't run if there's nothing to process - so that if for some reason the input is missed, say reading from Kafka latest or simply an empty topic, the assertion will be skipped and so we'll never fail (we would like to fail if there was no input, while we expected one).

      This might change once Spark provide a better support for the Beam model in streaming, but until then, it's best that our tests will consider this case as well.

      I'll add an aggregator and increment for assertion, at the end I'll make sure the aggregator is not 0, so that at least one assertion took place (if for some reason Spark kept on for a couple of more intervals it might execute the same assertion more then once, if the input is repeated).

      Attachments

        Issue Links

          Activity

            People

              amitsela Amit Sela
              amitsela Amit Sela
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: