Apache Hop (Retired) / HOP-3844

Failed to run example "input-process-output" pipeline in Spark.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 2.1.0
    • Component/s: Beam, Pipelines, Spark
    • Labels: None
    • Environment: Spark on Kubernetes

    Description

      I tried to run an example Hop pipeline ("input-process-output") in Spark, but the pipeline failed with the exception below:
      Caused by: java.lang.RuntimeException: java.lang.VerifyError: class scala.collection.convert.Wrappers$JListWrapper overrides final method scala.collection.mutable.AbstractBuffer.$plus$eq$colon(Ljava/lang/Object;)Lscala/collection/mutable/Buffer;
          at org.apache.beam.runners.spark.SparkPipelineResult.runtimeExceptionFrom(SparkPipelineResult.java:60)
          at org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:77)
          at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:104)
          at org.apache.hop.beam.engines.BeamPipelineEngine.evaluatePipelineStatus(BeamPipelineEngine.java:486)
          at org.apache.hop.beam.engines.BeamPipelineEngine.populateEngineMetrics(BeamPipelineEngine.java:370)
          at org.apache.hop.beam.engines.BeamPipelineEngine$1.run(BeamPipelineEngine.java:340)
          ... 2 more
      Caused by: java.lang.VerifyError: class scala.collection.convert.Wrappers$JListWrapper overrides final method scala.collection.mutable.AbstractBuffer.$plus$eq$colon(Ljava/lang/Object;)Lscala/collection/mutable/Buffer;
          at java.base/java.lang.ClassLoader.defineClass1(Native Method)
          at java.base/java.lang.ClassLoader.defineClass(Unknown Source)
          at java.base/java.security.SecureClassLoader.defineClass(Unknown Source)
          at java.base/java.net.URLClassLoader.defineClass(Unknown Source)
          at java.base/java.net.URLClassLoader$1.run(Unknown Source)
          at java.base/java.net.URLClassLoader$1.run(Unknown Source)
          at java.base/java.security.AccessController.doPrivileged(Native Method)
          at java.base/java.net.URLClassLoader.findClass(Unknown Source)
          at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
          at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
          at scala.collection.convert.WrapAsScala$class.asScalaBuffer(WrapAsScala.scala:105)
          at scala.collection.JavaConversions$.asScalaBuffer(JavaConversions.scala:52)
          at scala.collection.JavaConversions.asScalaBuffer(JavaConversions.scala)
          at org.apache.beam.runners.spark.io.SourceRDD$Bounded.<clinit>(SourceRDD.java:77)
          at org.apache.beam.runners.spark.translation.TransformTranslator$11.evaluate(TransformTranslator.java:641)
          at org.apache.beam.runners.spark.translation.TransformTranslator$11.evaluate(TransformTranslator.java:634)
          at org.apache.beam.runners.spark.SparkRunner$Evaluator.doVisitTransform(SparkRunner.java:449)
          at org.apache.beam.runners.spark.SparkRunner$Evaluator.visitPrimitiveTransform(SparkRunner.java:438)
          at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:593)
          at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:585)
          at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:585)
          at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:585)
          at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$500(TransformHierarchy.java:240)
          at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:214)
          at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:469)
          at org.apache.beam.runners.spark.SparkRunner.lambda$run$1(SparkRunner.java:233)
          at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
          at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
          at java.base/java.lang.Thread.run(Unknown Source)
       
      I was running Spark on Kubernetes. I tried the Spark 3.2.1, 3.1.3, and 3.1.2 images and got the same exception with all three.
       
      To recap what I did:
      I built a Hop fat jar and the metadata JSON file. I added the fat jar, the metadata JSON file, the pipeline file, and the required input files to the Spark distribution, then built a Spark image with the Hop pipeline inside. I started this image in Kubernetes using "spark-submit" (see the sketch at the end of this description). The driver pod started, and so did the executor, but the executor was then terminated. Checking the driver pod, I found the exception above. I am not executing the job remotely.
       
      I am running Hop 1.2.0. 
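       
      For reference, the submission looked roughly like the sketch below. This is a reconstruction of the steps above, not the exact command: the master URL, container image name, file paths, and run configuration name are placeholders, and the org.apache.hop.beam.run.MainBeam main class and its argument order (pipeline file, metadata file, run configuration name) follow the Hop documentation for running a fat jar with spark-submit, so they should be verified against the Hop version in use.
       
      # Submit the Hop fat jar to Spark on Kubernetes (all values below are placeholders)
      spark-submit \
        --master k8s://https://<kubernetes-api-server>:6443 \
        --deploy-mode cluster \
        --name hop-input-process-output \
        --class org.apache.hop.beam.run.MainBeam \
        --conf spark.executor.instances=1 \
        --conf spark.kubernetes.container.image=<registry>/spark-with-hop:3.2.1 \
        local:///opt/hop/hop-fat.jar \
        /opt/hop/pipelines/input-process-output.hpl \
        /opt/hop/metadata/hop-metadata.json \
        Spark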


          People

            Assignee: Unassigned
            Reporter: Jian Wu (jwu-kinaxis)
            Votes: 0
            Watchers: 3
