I noted that a user was having trouble running a Spark Streaming + Kafka app against Spark 2.3.0 standalone, from a binary release: the app failed with errors that Spark-Kafka-related classes could not be found. I was surprised when I looked at the contents of the 2.3.1 and 2.2.2 releases and found that jars/ contained only:
spark-catalyst_2.11-2.3.1.jar
spark-core_2.11-2.3.1.jar
spark-graphx_2.11-2.3.1.jar
spark-hive-thriftserver_2.11-2.3.1.jar
spark-hive_2.11-2.3.1.jar
spark-kubernetes_2.11-2.3.1.jar
spark-kvstore_2.11-2.3.1.jar
spark-launcher_2.11-2.3.1.jar
spark-mesos_2.11-2.3.1.jar
spark-mllib-local_2.11-2.3.1.jar
spark-mllib_2.11-2.3.1.jar
spark-network-common_2.11-2.3.1.jar
spark-network-shuffle_2.11-2.3.1.jar
spark-repl_2.11-2.3.1.jar
spark-sketch_2.11-2.3.1.jar
spark-sql_2.11-2.3.1.jar
spark-streaming_2.11-2.3.1.jar
spark-tags_2.11-2.3.1.jar
spark-unsafe_2.11-2.3.1.jar
spark-yarn_2.11-2.3.1.jar
There are no spark-streaming-kafka or spark-sql-kafka modules. While I still feel I might be missing a reason for this, it doesn't seem correct: Spark-Kafka apps won't work out of the box right now, yet we ship integration jars for other modules that are even off by default in the build.
The make-distribution.sh script does not appear to try to copy these JARs. Shouldn't it?
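In the meantime, affected users can likely work around the missing jars by resolving the Kafka integration from Maven at submit time via `--packages`. A sketch, assuming Scala 2.11 and Spark 2.3.1; the master URL and application jar name are placeholders:

```shell
# Workaround sketch: fetch the Kafka integration at submit time instead of
# relying on it being present in jars/ of the binary release.
# Artifact coordinates assume Scala 2.11 / Spark 2.3.1 -- adjust to your build.
spark-submit \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1 \
  --master spark://your-master:7077 \
  your-kafka-app.jar
```

(For DStream-based apps the coordinate would be `org.apache.spark:spark-streaming-kafka-0-10_2.11:2.3.1` instead.) That said, requiring network access at submit time is not a real fix for standalone deployments.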