Uploaded image for project: 'Apache Storm'
  1. Apache Storm
  2. STORM-2505

Kafka Spout doesn't support voids in the topic (topic compaction not supported)

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.x
    • 2.0.0, 1.1.1, 1.2.0
    • storm-kafka-client
    • None

    Description

      Kafka maintains the spout progress (offsets for partitions) which can hold a value which no longer exists (or offset+1 doesn't exist) in the topic due to following reasons

      • Topology stopped processing (or died) & topic got compacted (cleanup.policy=compact) leaving offset voids in the topic.
      • Topology stopped processing (or died) & Topic got cleaned up (cleanup.policy=delete) and the offset.

      When the topology starts processing again (or restarted), the spout logic suggests that the next offset has to be (committedOffset+1) for the spout to make progress, which will never be the case as (committedOffset+1) has been removed from the topic and will never be acked.

      OffsetManager.java
       if (currOffset == nextCommitOffset + 1) {            // found the next offset to commit
            found = true;
            nextCommitMsg = currAckedMsg;
            nextCommitOffset = currOffset;
      } else if (currOffset > nextCommitOffset + 1) {
            LOG.debug("topic-partition [{}] has non-continuous offset [{}]. It will be processed in a subsequent batch.", tp, currOffset);
      }
      

      A smart forwarding mechanism has to be built so as to forward the spout pivot to the next logical location, instead of a hardcoded single forward operation.

      Attachments

        Issue Links

          Activity

            People

              vivek Vivek Mittal
              vivek Vivek Mittal
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 8h 50m
                  8h 50m