  1. Spark
  2. SPARK-28603

Spark Streaming application receives inconsistent input events per batch interval


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.6.3
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      We run a Spark Streaming application with a 2-second batch interval. Spark is configured to receive from a RabbitMQ queue, and the batch interval was chosen based on the resources available in the cluster and on the processing time, so that batches do not incur scheduling delays. For each run we set MaxReceiverRate and BlockInterval and enable BackPressure to get consistent performance for each batch.
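For readers trying to reproduce the setup, the three settings named above probably correspond to the standard Spark Streaming properties shown in this sketch. The mapping is an assumption, since the report does not list the exact keys. The values are the ones from the example in this report, and `to_submit_args` is a hypothetical helper, not part of Spark.

```python
# Assumed mapping from the names used in this report to Spark property keys.
STREAMING_CONF = {
    # "MaxReceiverRate": upper bound in records per second, per receiver
    "spark.streaming.receiver.maxRate": "75",
    # "BlockInterval": how often received data is chunked into blocks
    "spark.streaming.blockInterval": "50ms",
    # "BackPressure": let Spark adapt the receiving rate to processing speed
    "spark.streaming.backpressure.enabled": "true",
}


def to_submit_args(conf):
    """Render the settings as `--conf key=value` arguments for spark-submit."""
    args = []
    for key in sorted(conf):
        args.extend(["--conf", "{}={}".format(key, conf[key])])
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(STREAMING_CONF)))
```

With backpressure enabled, the receiver's maximum rate is expected to act as a ceiling on the dynamically computed rate rather than as a fixed target.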

      For example, with MaxReceiverRate set to "75", BlockInterval = 50ms, and backpressure enabled, we expect a 2-second batch to receive 150 msgs (75 msgs/sec x 2 sec). Most of the time this is what we see. In a few cases, however, a few batches receive "0" events and a following batch receives, say, 3000 msgs, far more than MaxReceiverRate allows. We cannot explain this batch sizing behavior. It causes large scheduling delays, and the application's processing is then unable to catch up with the incoming message rate.
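The arithmetic behind the expected batch size, and a simple check for the anomaly described, can be sketched as follows. This assumes a single receiver and that the rate limit is expressed in records per second. The function names and the sample sequence of batch counts are hypothetical; only the figures 75, 2 sec, 50ms, 0 and 3000 come from the report.

```python
def expected_max_records(max_rate_per_sec, batch_interval_sec, num_receivers=1):
    """Upper bound on records per batch if the rate limit is honored."""
    return max_rate_per_sec * batch_interval_sec * num_receivers


def blocks_per_batch(batch_interval_ms, block_interval_ms):
    """Approximate number of blocks one receiver produces per batch."""
    return batch_interval_ms // block_interval_ms


def flag_batches(counts, ceiling):
    """Return (index, count, reason) for batches that are empty or over the ceiling."""
    flagged = []
    for index, count in enumerate(counts):
        if count == 0:
            flagged.append((index, count, "empty"))
        elif count > ceiling:
            flagged.append((index, count, "over ceiling"))
    return flagged


if __name__ == "__main__":
    ceiling = expected_max_records(75, 2)
    print("expected records per batch:", ceiling)
    print("blocks per batch:", blocks_per_batch(2000, 50))
    # Hypothetical per-batch input sizes showing the reported pattern.
    observed = [150, 150, 0, 0, 3000, 150]
    for item in flag_batches(observed, ceiling):
        print(item)
```

Running a check like this over the per-batch input sizes shown in the Streaming UI or logs would show how often the empty-then-oversized pattern occurs and whether it lines up with receiver restarts or slow batches.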

       


          People

            Assignee: Unassigned
            Reporter: Raja (Raja.Boyangari)
            Votes: 1
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Original Estimate: 72h
                Remaining Estimate: 72h
                Time Spent: Not Specified