Details
- Bug
- Status: Open
- Major
- Resolution: Unresolved
- 3.5.1
- None
- None
I'm using Spark 3.5.1 on Kubernetes with the Spark operator.
My project includes these dependencies:
implementation 'org.apache.spark:spark-core_2.12:3.5.1'
implementation 'org.apache.spark:spark-sql_2.12:3.5.1'
implementation 'com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.17.0'
sparkConnectorShadowJar 'org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.1'
sparkConnectorShadowJar 'io.delta:delta-sharing-spark_2.12:3.1.0'
The `sparkConnectorShadowJar` dependencies are packaged into a shadow jar and copied onto the `apache/spark:3.5.1` Docker image.
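Both connectors advertise their short names ("deltaSharing", "kafka") through an identically named `META-INF/services/org.apache.spark.sql.sources.DataSourceRegister` file. If the shadow jar is built with the Gradle Shadow plugin's default merge behavior, only one of those service files survives, which would produce exactly the symptoms described below. A sketch of a packaging fix, assuming the Shadow plugin and the configuration name above:

```groovy
// build.gradle -- sketch, assuming the Gradle Shadow plugin builds the connector jar.
// mergeServiceFiles() concatenates META-INF/services entries instead of letting
// one connector's DataSourceRegister file overwrite the other's.
shadowJar {
    configurations = [project.configurations.sparkConnectorShadowJar]
    mergeServiceFiles()
}
```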
Description
I have a simple Spark application that reads a CSV file via Delta Sharing and writes the contents to Kafka. When both the Delta Sharing and Kafka SQL libraries are included in the project, Spark is unable to load them by their format short names.
If I use either one without the other, everything works fine. When both are included, I get this root exception: `ClassNotFoundException: deltaSharing.DefaultSource`.
If I specify the source class names (`io.delta.sharing.spark.DeltaSharingDataSource`, `org.apache.spark.sql.kafka010.KafkaSourceProvider`) instead of the short names, it works correctly.
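For reference, a minimal sketch of the working form of the job; the profile path, share coordinates, broker address, and topic name are placeholders, not values from the actual application:

```scala
import org.apache.spark.sql.functions.{col, struct, to_json}

// Sketch of the workaround: address both sources by their fully-qualified
// provider class names rather than the "deltaSharing" / "kafka" short names.
val df = spark.read
  .format("io.delta.sharing.spark.DeltaSharingDataSource")
  .load("/opt/spark/conf/profile.share#myshare.myschema.mytable")

// The Kafka sink expects a "value" column; pack each row as JSON for illustration.
df.select(to_json(struct(col("*"))).alias("value"))
  .write
  .format("org.apache.spark.sql.kafka010.KafkaSourceProvider")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("topic", "events")
  .save()
```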