  Spark / SPARK-2383

With auto.offset.reset, KafkaReceiver potentially deletes Consumer nodes from Zookeeper

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: DStreams
    • Labels: None

      Description

      When auto.offset.reset is set in the Kafka configuration, KafkaReceiver's tryZookeeperConsumerGroupCleanup() deletes the whole /consumers/<groupId> tree in ZooKeeper before creating consumer nodes. If consumer nodes are already present (which can happen when multiple KafkaReceivers in the same consumer group are launched), they are deleted as well, leading to subsequent NoNode exceptions, for example on rebalance.

      There should be a check before the delete, such as if (zk.countChildren(dir + "/ids") == 0) ..., ideally performed atomically, to prevent deleting existing consumer nodes.
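The proposed guard could look roughly like the sketch below. It is an illustration only, not Spark's actual code: `ZkOps`, `InMemoryZk` and `cleanupIfNoConsumers` are hypothetical names, `ZkOps` merely mirrors the `exists`, `countChildren` and `deleteRecursive` calls of the ZkClient library, and the path assumes Kafka's usual /consumers/<groupId>/ids layout. The in-memory fake is there only so the sketch runs without a ZooKeeper ensemble.

```java
import java.util.TreeSet;

// Hypothetical stand-in for the three ZkClient calls the guard needs.
interface ZkOps {
    boolean exists(String path);
    int countChildren(String path);
    boolean deleteRecursive(String path);
}

// In-memory fake of a ZooKeeper tree, used only to make the sketch runnable.
class InMemoryZk implements ZkOps {
    private final TreeSet<String> nodes = new TreeSet<>();

    // Registers the node and every ancestor, like createPersistent(path, true).
    void createPersistent(String path) {
        int idx = 0;
        while ((idx = path.indexOf('/', idx + 1)) != -1) {
            nodes.add(path.substring(0, idx));
        }
        nodes.add(path);
    }

    public boolean exists(String path) {
        return nodes.contains(path);
    }

    // Counts direct children only; a missing node simply has zero children.
    public int countChildren(String path) {
        String prefix = path + "/";
        int count = 0;
        for (String node : nodes) {
            if (node.startsWith(prefix) && node.indexOf('/', prefix.length()) == -1) {
                count++;
            }
        }
        return count;
    }

    // Removes the node and its whole subtree; true if anything was removed.
    public boolean deleteRecursive(String path) {
        String prefix = path + "/";
        return nodes.removeIf(n -> n.equals(path) || n.startsWith(prefix));
    }
}

class ConsumerGroupCleanup {
    // Deletes /consumers/<groupId> only when no consumer is registered under
    // its "ids" node. Returns true if the tree was deleted.
    static boolean cleanupIfNoConsumers(ZkOps zk, String groupId) {
        String dir = "/consumers/" + groupId; // assumed Kafka layout
        if (zk.countChildren(dir + "/ids") > 0) {
            return false; // live consumers present: leave the tree alone
        }
        // Caveat: check-then-delete is NOT atomic. A consumer registering
        // between the two calls would still lose its node.
        return zk.deleteRecursive(dir);
    }
}
```

With one registered consumer under /consumers/g1/ids, `cleanupIfNoConsumers(zk, "g1")` returns false and leaves the tree intact; with an empty or missing ids node it removes the group's tree. The race noted in the comment remains, which is why the report asks for an atomic variant.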

      (Also note that the behavior of auto.offset.reset as implemented by Spark's Kafka receiver differs from the behavior defined in Kafka's documentation.)

              People

              • Assignee: Unassigned
              • Reporter: Tobias Pfeiffer (tgpfeiffer)
              • Votes: 1
              • Watchers: 6
