SPARK-15343: NoClassDefFoundError when initializing Spark with YARN

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: YARN
    • Labels:
      None

      Description

      I'm trying to connect Spark 2.0 (compiled from branch-2.0) with Hadoop.

      Spark was compiled with:

      ./dev/make-distribution.sh -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0 -DskipTests
      

      I'm getting the following error:

      mbrynski@jupyter:~/spark$ bin/pyspark
      Python 3.4.0 (default, Apr 11 2014, 13:05:11)
      [GCC 4.8.2] on linux
      Type "help", "copyright", "credits" or "license" for more information.
      Warning: Master yarn-client is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
      Setting default log level to "WARN".
      To adjust logging level use sc.setLogLevel(newLevel).
      16/05/16 11:54:41 WARN SparkConf: The configuration key 'spark.yarn.jar' has been deprecated as of Spark 2.0 and may be removed in the future. Please use the new key 'spark.yarn.jars' instead.
      16/05/16 11:54:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      16/05/16 11:54:42 WARN AbstractHandler: No Server set for org.spark_project.jetty.server.handler.ErrorHandler@f7989f6
      16/05/16 11:54:43 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
      Traceback (most recent call last):
        File "/home/mbrynski/spark/python/pyspark/shell.py", line 38, in <module>
          sc = SparkContext()
        File "/home/mbrynski/spark/python/pyspark/context.py", line 115, in __init__
          conf, jsc, profiler_cls)
        File "/home/mbrynski/spark/python/pyspark/context.py", line 172, in _do_init
          self._jsc = jsc or self._initialize_context(self._conf._jconf)
        File "/home/mbrynski/spark/python/pyspark/context.py", line 235, in _initialize_context
          return self._jvm.JavaSparkContext(jconf)
        File "/home/mbrynski/spark/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 1183, in __call__
        File "/home/mbrynski/spark/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", line 312, in get_return_value
      py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
      : java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
              at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:45)
              at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:163)
              at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
              at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:150)
              at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
              at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:148)
              at org.apache.spark.SparkContext.<init>(SparkContext.scala:502)
              at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
              at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
              at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
              at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
              at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240)
              at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
              at py4j.Gateway.invoke(Gateway.java:236)
              at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
              at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
              at py4j.GatewayConnection.run(GatewayConnection.java:211)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.ClassNotFoundException: com.sun.jersey.api.client.config.ClientConfig
              at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
              at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
              at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
              ... 19 more
      

      On Spark 1.6 everything works fine. I'm using HDP 2.2 (Hadoop 2.6.0).
      The HADOOP_CONF_DIR and SPARK_HOME environment variables are both set.
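
The trace above reports that `com.sun.jersey.api.client.config.ClientConfig` could not be loaded when YARN's `TimelineClient` was created. One way to narrow this down is to check whether any jar shipped with the Spark build contains that class. The following is a minimal diagnostic sketch, not part of the original report: the helper name `jars_containing` is hypothetical, and the assumption that the distribution keeps its jars under `$SPARK_HOME/jars` may not hold for every build layout.

```python
import zipfile
from pathlib import Path

# Class named in the NoClassDefFoundError, written as a jar entry path.
MISSING_CLASS = "com/sun/jersey/api/client/config/ClientConfig.class"


def jars_containing(jar_dir, class_entry=MISSING_CLASS):
    """Return the sorted names of jars in jar_dir that contain class_entry.

    jar_dir is assumed to be the directory holding the distribution's jars
    (for example $SPARK_HOME/jars); adjust it to match the actual layout.
    """
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if class_entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid archives.
            continue
    return hits
```

Calling `jars_containing(Path(os.environ["SPARK_HOME"]) / "jars")` (after `import os`) returns an empty list if no jar in that directory provides the class, which would be consistent with the `ClassNotFoundException` in the trace.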

              People

              • Assignee: Unassigned
              • Reporter: maver1ck (Maciej Bryński)
              • Votes: 0
              • Watchers: 23
