  1. Samza
  2. SAMZA-845

Reduce memory footprint for SystemConsumer


Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved

    Description

      Currently, KafkaSystemConsumer prefetches 50000 messages by default, a behavior introduced in SAMZA-203. According to Chris's comment in SAMZA-245 and a comment in SAMZA-775, each message may be buffered twice: once in KafkaSystemConsumer (bufferedMessages) and once in SystemConsumers (unprocessedMessagesBySSP). If each message is around 10k bytes, we need 10k*50k*2 bytes (about 1 GB) of memory for buffering, according to the comment.
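
      To make the arithmetic concrete, here is a minimal sketch of the estimate. It assumes "10k" means 10,000 bytes per message and that both buffers can each hold up to the prefetch threshold at once. The class and method names are illustrative only and are not Samza APIs.

```java
// Back-of-envelope estimate of the worst-case buffered bytes described above.
// All names here are hypothetical; nothing in this class is part of Samza.
final class BufferFootprintEstimate {

    // avgMessageBytes * fetchThreshold * bufferCopies, failing loudly on overflow.
    static long worstCaseBytes(long avgMessageBytes, long fetchThreshold, int bufferCopies) {
        long perBuffer = Math.multiplyExact(avgMessageBytes, fetchThreshold);
        return Math.multiplyExact(perBuffer, (long) bufferCopies);
    }

    public static void main(String[] args) {
        // Assumed inputs: 10,000-byte messages, 50,000-message threshold, 2 buffers.
        long bytes = worstCaseBytes(10_000L, 50_000L, 2);
        System.out.printf("%d bytes (~%.2f GB)%n", bytes, bytes / 1e9);
    }
}
```

      With these inputs the product is 1,000,000,000 bytes, so the worst case is roughly 1 GB of heap per container before any per-object overhead.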

      The reason messages are buffered twice is that BrokerProxy actively fetches messages whenever the total number of buffered messages is below the fetchThreshold (50000), to avoid potential message latency issues, and inserts them into KafkaSystemConsumer's bufferedMessages queue. Whenever a message is chosen (SystemConsumers.choose is called), SystemConsumers fetches messages from bufferedMessages, inserts them into its own buffer, unprocessedMessagesBySSP, and later hands them over to the StreamTask for processing.
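
      The flow above can be sketched as a toy model. This is one reading of how total occupancy can approach twice the threshold, not Samza's actual implementation: the class, its methods, and the drain-everything behavior are assumptions, and per-partition accounting is ignored.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of the two buffers described above. Names are hypothetical.
final class TwoStageBufferModel {
    private final int fetchThreshold;
    // Stands in for KafkaSystemConsumer's bufferedMessages queue.
    private final Queue<byte[]> bufferedMessages = new ArrayDeque<>();
    // Stands in for SystemConsumers' unprocessedMessagesBySSP buffer.
    private final Queue<byte[]> unprocessedMessages = new ArrayDeque<>();

    TwoStageBufferModel(int fetchThreshold) {
        this.fetchThreshold = fetchThreshold;
    }

    // Plays the role of BrokerProxy: top the first buffer up to the threshold.
    void prefetch(int messageBytes) {
        while (bufferedMessages.size() < fetchThreshold) {
            bufferedMessages.add(new byte[messageBytes]);
        }
    }

    // Plays the role of the hand-off on choose: move everything to the second buffer.
    void drainToUnprocessed() {
        byte[] message;
        while ((message = bufferedMessages.poll()) != null) {
            unprocessedMessages.add(message);
        }
    }

    int totalBuffered() {
        return bufferedMessages.size() + unprocessedMessages.size();
    }

    public static void main(String[] args) {
        TwoStageBufferModel model = new TwoStageBufferModel(1_000);
        model.prefetch(16);
        model.drainToUnprocessed();
        model.prefetch(16); // the first buffer refills while the second is still full
        System.out.println("messages held: " + model.totalBuffered());
    }
}
```

      In this model the first buffer is empty right after the hand-off, so the prefetcher refills it while the second buffer still holds the previous batch. That is why the combined count can reach twice the threshold.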

      It would be good to reduce the memory footprint here if this is the case. I would like to hear from others whether this is an issue, and I would like to take a stab at it myself if so.

          People

            Assignee: Unassigned
            Reporter: Tao Feng
            Votes: 0
            Watchers: 2
