The default way of creating a stream from a Kafka source is:
val stream = KafkaUtils.createStream(ssc, "localhost:2181", "logs", Map("retarget" -> 2, "datapair" -> 2))
However, if the two topics (in this case "retarget" and "datapair") are very different, there is no way to set up different filter or mapping functions for each, because the topics are effectively merged into a single stream.
Internally, though, the KafkaInputDStream created by this call invokes ConsumerConnector.createMessageStreams(), which returns a map of KafkaStreams keyed by topic. It would be great if this map were exposed somehow, so that a call such as
val streamS = KafkaUtils.createStreamS(...)
returned a map of streams.
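To illustrate the difference, here is a plain-Scala sketch (no Spark dependency; the Record type and sample data are hypothetical) of what the proposed API would expose: instead of one merged collection of messages from all topics, a map keyed by topic, so each topic can get its own filter/map pipeline.

```scala
// Hypothetical message type standing in for a consumed Kafka record.
case class Record(topic: String, value: String)

// What createStream effectively gives today: one merged collection
// of messages from both topics.
val merged: Seq[Record] = Seq(
  Record("retarget", "r1"),
  Record("datapair", "d1"),
  Record("retarget", "r2"))

// What the proposed createStreamS would give: a per-topic map,
// mirroring the map ConsumerConnector.createMessageStreams() returns.
val byTopic: Map[String, Seq[Record]] = merged.groupBy(_.topic)

// Each topic can now have its own processing logic:
val retargetValues = byTopic("retarget").map(_.value.toUpperCase)
val datapairValues = byTopic("datapair").map(_.value)
```

Until such an API exists, a common workaround is to call KafkaUtils.createStream once per topic, paying the cost of one receiver per call.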
- is cloned by
SPARK-12103 Clarify documentation of KafkaUtils createStream with multiple topics
- is part of
SPARK-3146 Improve the flexibility of Spark Streaming Kafka API to offer user the ability to process message before storing into BM