The KafkaConsumer Javadoc says: "The consumer will automatically ping the cluster periodically, which lets the cluster know that it is alive. As long as the consumer is able to do this it is considered alive and retains the right to consume from the partitions assigned to it." This is false. The heartbeat process is neither automatic nor periodic. The consumer heartbeats only when poll() is called. The thread that drives the consumer is therefore responsible for calling poll() again before session.timeout.ms elapses; if it does not, the coordinator considers the consumer dead and rebalances its partitions.
Based on this misinformation, it is easy for an implementer to build a batch-based Kafka consumer that takes longer than session.timeout.ms between poll() calls and then runs into rebalance loops that are very hard to diagnose. Clarifying the docs would help a lot. I'll submit a patch shortly.
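
To make the failure mode concrete, here is a rough, self-contained model of the timing. It is not Kafka code and uses no Kafka classes. The class name HeartbeatOnPollModel and the method countLatePolls are hypothetical, and the model assumes that a heartbeat is sent only at each poll() call and that the session counts as expired once the gap between two polls is strictly greater than session.timeout.ms. The 30000 ms timeout is just an illustrative value.

```java
// Simplified timing model, NOT Kafka code. Assumptions: a heartbeat is sent
// only when poll() is called, and the session expires once the gap between
// two consecutive polls is strictly greater than session.timeout.ms.
final class HeartbeatOnPollModel {

    /**
     * Counts how many poll() calls arrive after the session has already
     * expired, for a consumer that spends batchProcessingMs on each batch
     * before polling again.
     */
    static int countLatePolls(long sessionTimeoutMs, long batchProcessingMs, int batches) {
        long nowMs = 0L;
        long lastHeartbeatMs = nowMs; // the first poll() heartbeats at time 0
        int latePolls = 0;

        for (int i = 0; i < batches; i++) {
            nowMs += batchProcessingMs; // the batch is processed between polls
            if (nowMs - lastHeartbeatMs > sessionTimeoutMs) {
                latePolls++; // in this model, each late poll implies a rebalance
            }
            lastHeartbeatMs = nowMs; // this poll() heartbeats again
        }
        return latePolls;
    }

    public static void main(String[] args) {
        // 45 s per batch against a 30 s session timeout: every poll is late.
        System.out.println(countLatePolls(30_000L, 45_000L, 3));
        // 10 s per batch against the same timeout: no poll is late.
        System.out.println(countLatePolls(30_000L, 10_000L, 3));
    }
}
```

In this model a consumer whose batches always take longer than the timeout is late on every single poll, which is what the rebalance loop looks like from the outside: the consumer rejoins the group, fetches a batch, and has lost its session again before it gets back to poll().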