SPARK-46409: Spark Connect Repl does not work with ClosureCleaner


    Description

      SPARK-45136 added ClosureCleaner support to the Spark Connect client. Unfortunately, this change breaks the ConnectRepl launched by `./connector/connect/bin/spark-connect-scala-client`. To reproduce the issue:

      1. Run `./connector/connect/bin/spark-connect-shell`
      2. Run `./connector/connect/bin/spark-connect-scala-client`
      3. In the REPL, execute this code:
        ```
        @ def plus1(x: Int): Int = x + 1
        @ val plus1_udf = udf(plus1 _)
        ```

      This will fail with the following error:

      ```

      java.lang.reflect.InaccessibleObjectException: Unable to make private native java.lang.reflect.Field[] java.lang.Class.getDeclaredFields0(boolean) accessible: module java.base does not "opens java.lang" to unnamed module @45099dd3
        java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
        java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
        java.lang.reflect.Method.checkCanSetAccessible(Method.java:199)
        java.lang.reflect.Method.setAccessible(Method.java:193)
        org.apache.spark.util.ClosureCleaner$.getFinalModifiersFieldForJava17(ClosureCleaner.scala:577)
        org.apache.spark.util.ClosureCleaner$.setFieldAndIgnoreModifiers(ClosureCleaner.scala:560)
        org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$18(ClosureCleaner.scala:533)
        org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$18$adapted(ClosureCleaner.scala:525)
        scala.collection.ArrayOps$WithFilter.foreach(ArrayOps.scala:73)
        org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$16(ClosureCleaner.scala:525)
        org.apache.spark.util.ClosureCleaner$.$anonfun$cleanupAmmoniteReplClosure$16$adapted(ClosureCleaner.scala:522)
        scala.collection.IterableOnceOps.foreach(IterableOnce.scala:576)
        scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:574)
        scala.collection.AbstractIterable.foreach(Iterable.scala:933)
        scala.collection.IterableOps$WithFilter.foreach(Iterable.scala:903)
        org.apache.spark.util.ClosureCleaner$.cleanupAmmoniteReplClosure(ClosureCleaner.scala:522)
        org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:251)
        org.apache.spark.sql.expressions.SparkConnectClosureCleaner$.clean(UserDefinedFunction.scala:210)
        org.apache.spark.sql.expressions.ScalarUserDefinedFunction$.apply(UserDefinedFunction.scala:187)
        org.apache.spark.sql.expressions.ScalarUserDefinedFunction$.apply(UserDefinedFunction.scala:180)
        org.apache.spark.sql.functions$.udf(functions.scala:7956)
        ammonite.$sess.cmd1$Helper.<init>(cmd1.sc:1)
        ammonite.$sess.cmd1$.<clinit>(cmd1.sc:7)

      ```


      This is because ClosureCleaner relies heavily on the reflection API and is not compatible with the strong module encapsulation introduced in Java 17. The rest of Spark works around this by adding `--add-opens` JVM flags; see https://issues.apache.org/jira/browse/SPARK-36796. We need to add these options to the Spark Connect client launch script as well.
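      A minimal sketch of the proposed fix, reusing the module options that SPARK-36796 introduced for the rest of Spark. How the launch script actually passes JVM options is an assumption here; `JDK_JAVA_OPTIONS` is one standard mechanism, since the `java` launcher picks it up automatically on JDK 9+.

      ```shell
      # Sketch: pass the JPMS --add-opens flags (from SPARK-36796) to the client JVM
      # so ClosureCleaner's reflective access to java.base internals is permitted.
      # The script could equally append these flags directly to its java invocation.
      export JDK_JAVA_OPTIONS="${JDK_JAVA_OPTIONS:-} \
        --add-opens=java.base/java.lang=ALL-UNNAMED \
        --add-opens=java.base/java.lang.invoke=ALL-UNNAMED \
        --add-opens=java.base/java.lang.reflect=ALL-UNNAMED \
        --add-opens=java.base/java.util=ALL-UNNAMED"
      ```

      This is only a subset of the flags Spark's `JavaModuleOptions` defines; the launch script should use the full list to stay consistent with the rest of the project.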

            People

              Assignee: Vsevolod Stepanov
              Reporter: Vsevolod Stepanov
              Votes: 0
              Watchers: 2
