In Kafka 0.10.2.1 we changed the default value of max.poll.interval.ms for Kafka Streams to Integer.MAX_VALUE. The reason was that long state restore phases during a rebalance could cause "rebalance storms": healthy consumers dropped out of the consumer group because they did not call poll() during the state restore phase.
In versions 0.11 and 1.0 the state restore logic was improved considerably, and Kafka Streams now calls poll() even during the restore phase. Therefore, we might consider setting a smaller timeout for max.poll.interval.ms to detect badly behaving Kafka Streams applications (i.e., targeting user code) that no longer make progress during regular operation.
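To illustrate why polling during restore matters, here is a minimal sketch of the timeout check. It is a simplified model, not Kafka's actual implementation: the class and method names are hypothetical, and the 10-minute restore, 1-second batch time, and 5-minute interval are made-up numbers chosen only for the example.

```java
public class PollIntervalModel {

    /** True if the gap since the last poll() exceeds the allowed interval. */
    public static boolean exceedsPollInterval(long lastPollMs, long nowMs, long maxPollIntervalMs) {
        return nowMs - lastPollMs > maxPollIntervalMs;
    }

    /**
     * Simulates a restore phase that processes one batch every batchMs.
     * Returns true if the member would be treated as failed before the
     * restore finishes. This is a hypothetical model for illustration only.
     */
    public static boolean droppedDuringRestore(long totalRestoreMs, long batchMs,
                                               boolean pollsBetweenBatches,
                                               long maxPollIntervalMs) {
        if (batchMs <= 0) {
            throw new IllegalArgumentException("batchMs must be positive");
        }
        long lastPollMs = 0;
        for (long nowMs = batchMs; nowMs <= totalRestoreMs; nowMs += batchMs) {
            if (exceedsPollInterval(lastPollMs, nowMs, maxPollIntervalMs)) {
                return true;
            }
            if (pollsBetweenBatches) {
                lastPollMs = nowMs; // calling poll() resets the timer
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long restoreMs = 600_000L;  // assumed 10-minute restore
        long batchMs = 1_000L;      // assumed 1 second per restored batch
        long intervalMs = 300_000L; // assumed 5-minute poll interval

        // Old behavior: no poll() during restore, finite interval.
        System.out.println(droppedDuringRestore(restoreMs, batchMs, false, intervalMs));
        // 0.10.2.1 workaround: no poll() during restore, interval = Integer.MAX_VALUE.
        System.out.println(droppedDuringRestore(restoreMs, batchMs, false, Integer.MAX_VALUE));
        // Newer behavior: poll() once per restored batch, finite interval.
        System.out.println(droppedDuringRestore(restoreMs, batchMs, true, intervalMs));
    }
}
```

In this model, only the first scenario drops out of the group, which is the case the Integer.MAX_VALUE default was introduced to avoid.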
The open question is what a good default would be. Perhaps the consumer's own default of 5 minutes (300000 ms) is sufficient. During one poll() round trip, we would call restoreConsumer.poll() only once and restore a single batch of records. This should take far less time than that.
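Until a new default is decided, an application can override the setting itself. The sketch below uses only java.util.Properties and the literal key max.poll.interval.ms; the helper class, method name, and the 60-second value are hypothetical placeholders for illustration, not a recommended default. In a real application the resulting Properties would be merged into the configuration passed to Kafka Streams.

```java
import java.util.Properties;

public class StreamsPollIntervalConfig {

    // Consumer config key under discussion.
    public static final String MAX_POLL_INTERVAL_MS = "max.poll.interval.ms";

    /** Returns the given properties with max.poll.interval.ms set to timeoutMs. */
    public static Properties withMaxPollInterval(Properties props, long timeoutMs) {
        if (timeoutMs <= 0) {
            throw new IllegalArgumentException("timeoutMs must be positive");
        }
        props.setProperty(MAX_POLL_INTERVAL_MS, Long.toString(timeoutMs));
        return props;
    }

    public static void main(String[] args) {
        // 60 seconds is an arbitrary placeholder value, not a proposal.
        Properties props = withMaxPollInterval(new Properties(), 60_000L);
        System.out.println(props.getProperty(MAX_POLL_INTERVAL_MS));
    }
}
```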