  1. Samza
  2. SAMZA-2687 Elasticity: scale up task count beyond the input partition count.
  3. SAMZA-2724

[Elasticity] optimizations to improve throughput when elasticity is enabled by filtering out unwanted messages within SystemConsumers before RunLoop


Details

    • Sub-task
    • Status: Open
    • Major
    • Resolution: Unresolved

    Description

      When elasticity is enabled, the following optimization can be made to improve throughput:

       

      In SystemConsumers, filter out the messages that the RunLoop is not going to process. That is, of the messages fetched by consumer.poll(ssp), remove those that belong to key buckets of the SSP which the container does not consume according to its job model. This ensures that the RunLoop receives only the messages it needs to process.
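      The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not Samza code: the class KeyBucketFilter, the Message type, and the bucketing rule (hash of the message key modulo an elasticity factor) are all assumptions made for the example, and the real SystemConsumers logic and key-bucket computation may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Hypothetical sketch; none of these names are Samza APIs. */
final class KeyBucketFilter {

    /** Minimal stand-in for a polled message: a partitioning key and a payload. */
    static final class Message {
        final String key;
        final String payload;

        Message(String key, String payload) {
            this.key = key;
            this.payload = payload;
        }
    }

    /** Assumed bucketing rule: non-negative hash of the key modulo the elasticity factor. */
    static int keyBucket(String key, int elasticityFactor) {
        return Math.floorMod(Objects.hashCode(key), elasticityFactor);
    }

    /**
     * Keeps only the messages whose key bucket is owned by this container.
     * ownedBuckets stands in for the key buckets of one SSP that the
     * container's job model assigns to it (an assumption of this sketch).
     */
    static List<Message> filter(List<Message> polled,
                                Set<Integer> ownedBuckets,
                                int elasticityFactor) {
        // With a factor of 1 or less there is a single bucket, so nothing is dropped.
        if (elasticityFactor <= 1) {
            return polled;
        }
        List<Message> kept = new ArrayList<>();
        for (Message m : polled) {
            if (ownedBuckets.contains(keyBucket(m.key, elasticityFactor))) {
                kept.add(m);
            }
        }
        return kept;
    }
}
```

      In this sketch the filter would be applied to the result of each poll before the messages are handed to the RunLoop, so messages from unowned key buckets are dropped early rather than inside the loop.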

       

      Note that while prototyping this optimization, the filtering was observed to delay the start of processing in all containers. The delay occurs because all messages from an SSP are filtered out for roughly the first 7-10 minutes. This could be due to how the messages are fetched from the specific input topic and needs deeper investigation.

          People

            lakshmi-manasa Lakshmi Manasa Gaduputi
