Apache Storm / STORM-3279

Kafka Trident spout could lose its position with EARLIEST or LATEST FirstPollOffsetStrategy

      Description

      In the emitPartitionBatch() method of KafkaTridentSpoutEmitter, if kafkaConsumer.poll(pollTimeoutMs) returns 0 records for the very first transaction and FirstPollOffsetStrategy is set to EARLIEST or LATEST, the spout never moves to the earliest or latest offset. It continues from the last metadata position instead.

       

      The sequence of events that causes this bug:

       

      1. FirstPollOffsetStrategy is set to EARLIEST or LATEST.

      2. For the first transaction after a restart (txid1), per link L164, currentBatch is initialized to lastBatchMeta (which need not be null).

      3. Later, at L171, the consumer seeks to the start or the end of the partition.

      4. Then consumer.poll(pollTimeoutMs) is called.

      5. If poll returns one or more records, currentBatch is set to new metadata. If poll returns 0 records, currentBatch is not reset, i.e. it is still lastBatchMeta (which need not be null).

       

      As a result, in transaction txid2, which follows txid1, isFirstPoll() returns false and the spout continues from lastBatchMeta rather than from the offset selected by the strategy.
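
      The sequence above can be reproduced with a small toy model. This is only a sketch of the reported behaviour, not the actual Storm code: ToyConsumer, ToyEmitter, BatchMeta, the boolean first-poll flag and the offsets 40, 41 and 100 are all hypothetical stand-ins chosen for illustration, and no Kafka or Storm classes are used.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstPollSketch {
    enum Strategy { EARLIEST, LATEST }

    /** Hypothetical stand-in for the batch metadata: first and last emitted offsets. */
    static final class BatchMeta {
        final long firstOffset;
        final long lastOffset;
        BatchMeta(long firstOffset, long lastOffset) {
            this.firstOffset = firstOffset;
            this.lastOffset = lastOffset;
        }
    }

    /** Hypothetical single-partition consumer over the offsets [0, endOffset). */
    static final class ToyConsumer {
        private final long endOffset;
        private long position;
        ToyConsumer(long endOffset) { this.endOffset = endOffset; }
        void seek(long offset) { position = offset; }
        void seekToBeginning() { position = 0; }
        void seekToEnd() { position = endOffset; }
        /** maxRecords == 0 models a poll that times out without returning records. */
        List<Long> poll(int maxRecords) {
            List<Long> offsets = new ArrayList<>();
            while (offsets.size() < maxRecords && position < endOffset) {
                offsets.add(position++);
            }
            return offsets;
        }
    }

    /** Toy emitter that follows steps 2 to 5 of the report. */
    static final class ToyEmitter {
        private final Strategy strategy;
        private boolean firstPollDone = false; // assumption: first poll tracked by a flag
        ToyEmitter(Strategy strategy) { this.strategy = strategy; }

        boolean isFirstPoll() { return !firstPollDone; }

        BatchMeta emitPartitionBatch(ToyConsumer consumer, BatchMeta lastBatchMeta, int maxRecords) {
            BatchMeta currentBatch = lastBatchMeta;            // step 2
            if (isFirstPoll()) {                               // step 3
                if (strategy == Strategy.EARLIEST) {
                    consumer.seekToBeginning();
                } else {
                    consumer.seekToEnd();
                }
                firstPollDone = true;
            } else if (lastBatchMeta != null) {
                consumer.seek(lastBatchMeta.lastOffset + 1);
            }
            List<Long> records = consumer.poll(maxRecords);    // step 4
            if (!records.isEmpty()) {                          // step 5
                currentBatch = new BatchMeta(records.get(0), records.get(records.size() - 1));
            }
            return currentBatch; // still lastBatchMeta when the poll was empty
        }
    }

    /** Runs txid1 with an empty poll, then txid2, and returns the first offset emitted in txid2. */
    static long firstOffsetOfSecondBatch(Strategy strategy) {
        ToyConsumer consumer = new ToyConsumer(100);
        BatchMeta restored = new BatchMeta(40, 41);            // metadata from before the restart
        ToyEmitter emitter = new ToyEmitter(strategy);
        BatchMeta afterTx1 = emitter.emitPartitionBatch(consumer, restored, 0);
        BatchMeta afterTx2 = emitter.emitPartitionBatch(consumer, afterTx1, 10);
        return afterTx2.firstOffset;
    }

    public static void main(String[] args) {
        System.out.println("EARLIEST: " + firstOffsetOfSecondBatch(Strategy.EARLIEST));
        System.out.println("LATEST: " + firstOffsetOfSecondBatch(Strategy.LATEST));
    }
}
```

      In this model both strategies resume txid2 at offset 42, one past the restored metadata, whereas EARLIEST would be expected to start at offset 0 and LATEST at offset 100.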

       

       

              People

              • Assignee: Stig Rohde Døssing (Srdo)
              • Reporter: Janith Kaiprath Valiyalappil (janithkv)