Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 0.8.0
- Labels: None
Description
Hi all, thanks for the 0.8.0 release!
We’re keen to take advantage of the yarn-cluster support to take the pressure off our Zeppelin host. However, I am having some trouble with it. The first problem came up while following the documentation here:
https://zeppelin.apache.org/docs/0.8.0/interpreter/spark.html
This suggests that we need to modify the master configuration from “yarn-client” to “yarn-cluster”.
However, doing so results in the following error:
Warning: Master yarn-cluster is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
Error: Client deploy mode is not compatible with master "yarn-cluster"
Run with --help for usage help or --verbose for debug output
<stacktrace>
I got past this error with the following settings:
master = yarn
spark.submit.deployMode = cluster
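If I understand correctly, this pair of interpreter properties corresponds to the usual spark-submit flags, i.e. roughly:
spark-submit --master yarn --deploy-mode cluster ...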
I’m somewhat unclear whether I’m straying from the correct (documented) configuration or whether the documentation needs an update. Anyway, these settings appear to work for everything except the ZeppelinContext, which is missing.
Code:
%spark
z
Output:
<console>:24: error: not found: value z
Using yarn-client mode, I can confirm that z is meant to be an instance of org.apache.zeppelin.spark.SparkZeppelinContext:
Code:
%spark
z
Output:
res4: org.apache.zeppelin.spark.SparkZeppelinContext = org.apache.zeppelin.spark.SparkZeppelinContext@5b9282e1
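For context, this is the kind of thing we rely on z for in client mode; a minimal sketch (the DataFrame and key names are just illustrative):
Code:
%spark
val df = spark.range(5).toDF("n")  // small example DataFrame
z.put("rowCount", df.count())      // share a value with other paragraphs via ZeppelinContext
z.show(df)                         // render the DataFrame as a Zeppelin table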
However, this class is absent in cluster mode:
Code:
%spark
org.apache.zeppelin.spark.SparkZeppelinContext
Output:
<console>:24: error: object zeppelin is not a member of package org.apache
       org.apache.zeppelin.spark.SparkZeppelinContext
       ^
Snooping around the Zeppelin installation, I was able to locate this class in ${ZEPPELIN_INSTALL_DIR}/interpreter/spark/spark-interpreter-0.8.0.jar. I then uploaded this jar to HDFS and added it to spark.jars and spark.driver.extraClassPath. Relevant entries from the driver log:
…
Added JAR hdfs:/spark-interpreter-0.8.0.jar at hdfs:/tmp/zeppelin/spark-interpreter-0.8.0.jar with timestamp 1531732774379
…
CLASSPATH -> …:hdfs:/tmp/zeppelin/spark-interpreter-0.8.0.jar …
…
command: … file:$PWD/spark-interpreter-0.8.0.jar \
etc.
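For reference, the interpreter properties I added looked roughly like this (the exact HDFS path and classpath value are illustrative):
spark.jars = hdfs:///tmp/zeppelin/spark-interpreter-0.8.0.jar
spark.driver.extraClassPath = spark-interpreter-0.8.0.jar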
However, I still can’t use the ZeppelinContext or the org.apache.zeppelin.spark.SparkZeppelinContext class. At this point I’ve run out of ideas, so I’m asking for help.
Does anyone have thoughts on how I could use the ZeppelinContext in yarn cluster mode?
Attachments
Issue Links
- blocks: ZEPPELIN-3629 Release 0.8.1 (Closed)
- links to