Description
KafkaUtils.createDirectStream does not work in Python 3 when the fromOffsets parameter (the starting offsets of the stream in Kafka) is set. Python 3 removed the long type, and py4j maps numeric values to java.lang.Integer or java.lang.Long depending on their magnitude. Small offset values therefore arrive on the JVM side as java.lang.Integer where a java.lang.Long is expected, which causes a ClassCastException.
This behaviour was noticed before, and the tests for this functionality are disabled in Python 3: https://github.com/apache/spark/blob/89e67d6667d5f8be9c6fb6c120fbcd350ae2950d/python/pyspark/streaming/tests.py#L1061
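The type mismatch described above can be illustrated without a Spark or Kafka installation. The following is a minimal sketch, not py4j's actual code: it assumes the choice between java.lang.Integer and java.lang.Long is made purely on whether the value fits in a signed 32-bit integer. The helper `java_type_for`, the topic name `events`, and the offsets are all hypothetical and exist only for illustration.

```python
# Simplified model of how a Python 3 int would be mapped to a Java boxed type.
# Assumption: the signed 32-bit range decides between Integer and Long.
JAVA_MIN_INT = -2**31
JAVA_MAX_INT = 2**31 - 1


def java_type_for(value):
    """Return the name of the Java type a Python 3 int would be mapped to."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        raise TypeError("expected an int")
    if JAVA_MIN_INT <= value <= JAVA_MAX_INT:
        return "java.lang.Integer"
    return "java.lang.Long"


# Hypothetical starting offsets keyed by (topic, partition).
from_offsets = {
    ("events", 0): 0,
    ("events", 1): 42,
    ("events", 2): 2**40,
}

# Small offsets map to Integer; only the large one maps to Long.
mapped = {key: java_type_for(offset) for key, offset in from_offsets.items()}

for key, type_name in sorted(mapped.items()):
    print(key, from_offsets[key], type_name)
```

Under this model, only offsets outside the 32-bit range reach the JVM as java.lang.Long, which is consistent with the failure appearing for small offsets only.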
Issue Links
- is duplicated by
  - SPARK-17411 Cannot set fromOffsets in createDirectStream function (Resolved)
  - SPARK-15559 TopicAndPartition should provide __hash__ method (Closed)