Details
- Bug
- Status: Closed
- Critical
- Resolution: Fixed
Description
We probably need to fix the way our bundles are packaged. For example, querying a Hudi table with the hudi-utilities bundle succeeds with 0.10.1 but fails with master. This is likely the same reason the integ test suite bundle fails to query Hudi tables.
./bin/spark-shell \
  --packages org.apache.spark:spark-avro_2.11:2.4.4 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --jars ~/Documents/personal/projects/apache_hudi_dec/hudi/packaging/hudi-utilities-bundle/target/hudi-utilities-bundle_2.11-0.10.1-rc2.jar

scala> val df = spark.read.format("hudi").load("/tmp/hudi-deltastreamer-ny/")
scala> df.count
./bin/spark-shell \
  --packages org.apache.spark:spark-avro_2.11:2.4.4 \
  --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
  --jars ~/Documents/personal/projects/nov26/hudi/packaging/hudi-utilities-bundle/target/hudi-utilities-bundle_2.11-0.11.0-SNAPSHOT.jar

scala> val df = spark.read.format("hudi").load("/tmp/hudi-deltastreamer-ny/")
java.lang.ClassNotFoundException: Failed to find data source: hudi. Please find packages at http://spark.apache.org/third-party-projects.html
  at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:675)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:213)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:197)
  ... 49 elided
Caused by: java.lang.ClassNotFoundException: hudi.DefaultSource
  at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:652)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20$$anonfun$apply$12.apply(DataSource.scala:652)
  at scala.util.Try$.apply(Try.scala:192)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:652)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$20.apply(DataSource.scala:652)
  at scala.util.Try.orElse(Try.scala:84)
  at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:652)
  ... 51 more
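One possible way to narrow this down is sketched below; it is not a confirmed root cause. Spark resolves a short name such as "hudi" through ServiceLoader registrations of DataSourceRegister, and only then falls back to loading a class named hudi.DefaultSource, which matches the second exception in the trace above. Comparing the contents of the two bundle jars may therefore show what the master build is missing. The function name check_hudi_datasource_entries and the variable JAR are hypothetical, and the path org/apache/hudi/DefaultSource.class is an assumption about where the bundle places Hudi's data source class.

```shell
# Reads a jar listing (one entry per line) on stdin and reports whether the
# entries Spark would need to resolve format("hudi") are present.
# Returns 0 if both entries are found, 1 otherwise.
check_hudi_datasource_entries() {
  local listing entry status
  status=0
  listing="$(cat)"
  for entry in \
    'META-INF/services/org.apache.spark.sql.sources.DataSourceRegister' \
    'org/apache/hudi/DefaultSource.class'   # assumed class location
  do
    # -x matches the whole line, -F treats the entry as a fixed string.
    if printf '%s\n' "$listing" | grep -qxF -- "$entry"; then
      echo "present: $entry"
    else
      echo "MISSING: $entry"
      status=1
    fi
  done
  return "$status"
}

# Usage (JAR is a placeholder for the path to either bundle jar):
#   jar tf "$JAR" | check_hudi_datasource_entries
```

Finding the service file does not prove that it lists Hudi's class, so if both entries are present the next step would be to print the file itself, for example with `unzip -p "$JAR" META-INF/services/org.apache.spark.sql.sources.DataSourceRegister`.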
Originally reported via GitHub issue: https://github.com/apache/hudi/issues/4621