Uploaded image for project: 'Apache Storm'
  1. Apache Storm
  2. STORM-2505

Kafka Spout doesn't support voids in the topic (topic compaction not supported)

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.x
    • Fix Version/s: 2.0.0, 1.1.1, 1.2.0
    • Component/s: storm-kafka-client
    • Labels:
      None

      Description

      Kafka maintains the spout progress (offsets for partitions) which can hold a value which no longer exists (or offset+1 doesn't exist) in the topic due to following reasons

      • Topology stopped processing (or died) & topic got compacted (cleanup.policy=compact) leaving offset voids in the topic.
      • Topology stopped processing (or died) & Topic got cleaned up (cleanup.policy=delete) and the offset.

      When the topology starts processing again (or restarted), the spout logic suggests that the next offset has to be (committedOffset+1) for the spout to make progress, which will never be the case as (committedOffset+1) has been removed from the topic and will never be acked.

      OffsetManager.java
       if (currOffset == nextCommitOffset + 1) {            // found the next offset to commit
            found = true;
            nextCommitMsg = currAckedMsg;
            nextCommitOffset = currOffset;
      } else if (currOffset > nextCommitOffset + 1) {
            LOG.debug("topic-partition [{}] has non-continuous offset [{}]. It will be processed in a subsequent batch.", tp, currOffset);
      }
      

      A smart forwarding mechanism has to be built so as to forward the spout pivot to the next logical location, instead of a hardcoded single forward operation.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                vivek Vivek Mittal
                Reporter:
                vivek Vivek Mittal
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 8h 50m
                  8h 50m