  2. SPARK-45598

Delta table 3.0.0 not working with Spark Connect 3.5.0

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.5.0
    • Fix Version/s: None
    • Component/s: Connect
    • Labels: None

    Description

      Spark version 3.5.0

      Spark Connect version 3.5.0

      Delta table 3.0.0

      The Spark Connect server was started with:

      ./sbin/start-connect-server.sh \
        --master spark://localhost:7077 \
        --packages org.apache.spark:spark-connect_2.12:3.5.0,io.delta:delta-spark_2.12:3.0.0 \
        --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
        --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog" \
        --conf 'spark.jars.repositories=https://oss.sonatype.org/content/repositories/iodelta-1120'

      The Connect client depends on
      libraryDependencies += "io.delta" %% "delta-spark" % "3.0.0"
      and on the Spark Connect libraries.
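
      For reference, a client build matching this description might look like the sketch below. The report does not name the Connect artifacts, so the spark-connect-client-jvm coordinate and the exact Scala patch version are assumptions; only the delta-spark line comes from the report.

      ```scala
      // build.sbt (sketch; only the delta-spark dependency is taken from the report)

      // Assumption: a Scala 2.12.x release, matching the _2.12 artifacts
      // passed to --packages on the server side.
      scalaVersion := "2.12.18"

      libraryDependencies ++= Seq(
        // Assumption: the Spark Connect Scala client, same version as the server
        "org.apache.spark" %% "spark-connect-client-jvm" % "3.5.0",
        // From the report
        "io.delta" %% "delta-spark" % "3.0.0"
      )
      ```

      Keeping the client on the same Scala binary version and Spark release as the server rules out one source of serialization mismatches, though it is not confirmed to be the cause of the failure reported here.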
       

      Running a simple job that writes to a Delta table fails:

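      import org.apache.spark.sql.SparkSession

      // Connect to the Spark Connect server on localhost
      val spark = SparkSession.builder().remote("sc://localhost").getOrCreate()
      val data = spark.read.json("profiles.json")
      // This write fails with the ClassCastException shown in the logs
      data.write.format("delta").save("/tmp/delta")
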

      Error log in the Connect client:

      Exception in thread "main" org.apache.spark.SparkException: io.grpc.StatusRuntimeException: INTERNAL: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 4) (172.23.128.15 executor 0): java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.sql.catalyst.expressions.ScalaUDF.f of type scala.Function1 in instance of org.apache.spark.sql.catalyst.expressions.ScalaUDF
          at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2301)
          at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1431)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2437)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
      ...
          at org.apache.spark.sql.connect.client.GrpcExceptionConverter$.toThrowable(GrpcExceptionConverter.scala:110)
          at org.apache.spark.sql.connect.client.GrpcExceptionConverter$.convert(GrpcExceptionConverter.scala:41)
          at org.apache.spark.sql.connect.client.GrpcExceptionConverter$$anon$1.hasNext(GrpcExceptionConverter.scala:49)
          at scala.collection.Iterator.foreach(Iterator.scala:943)
          at scala.collection.Iterator.foreach$(Iterator.scala:943)
          at org.apache.spark.sql.connect.client.GrpcExceptionConverter$$anon$1.foreach(GrpcExceptionConverter.scala:46)
          at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
          at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
          at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
          at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
          at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
          at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
          at org.apache.spark.sql.connect.client.GrpcExceptionConverter$$anon$1.to(GrpcExceptionConverter.scala:46)
          at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:358)
          at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:358)
          at org.apache.spark.sql.connect.client.GrpcExceptionConverter$$anon$1.toBuffer(GrpcExceptionConverter.scala:46)
          at org.apache.spark.sql.SparkSession.execute(SparkSession.scala:554)
          at org.apache.spark.sql.DataFrameWriter.executeWriteOperation(DataFrameWriter.scala:257)
          at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:221)
          at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:210)
          at Main$.main(Main.scala:11)
          at Main.main(Main.scala)

       

      Error log in the Spark Connect server:

      23/10/13 12:26:32 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1) (172.23.128.15 executor 0): java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field org.apache.spark.sql.catalyst.expressions.ScalaUDF.f of type scala.Function1 in instance of org.apache.spark.sql.catalyst.expressions.ScalaUDF
          at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2301)
          at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1431)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2437)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2311)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
          at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
          at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:527)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1184)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2322)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
          at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
          at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
          at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
          at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
          at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
          at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:87)
          at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:129)
          at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:86)
          at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
          at org.apache.spark.scheduler.Task.run(Task.scala:141)
          at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
          at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
          at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
          at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
          at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:750)

          People

            Assignee: Unassigned
            Reporter: Faiz Halde (haldefaiz)