Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Invalid
Affects Version/s: 0.7.1
Fix Version/s: None
Component/s: None
Description
I am using Zeppelin 0.7.1 with livy-0.4-snapshot. When I edit the Livy interpreter settings related to Spark resources in the Zeppelin web UI, I get the following error from the YARN application master.
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.xerces.dom.DeferredDocumentImpl.getNodeObject(Unknown Source)
at org.apache.xerces.dom.DeferredDocumentImpl.synchronizeChildren(Unknown Source)
at org.apache.xerces.dom.DeferredElementNSImpl.synchronizeChildren(Unknown Source)
at org.apache.xerces.dom.ParentNode.hasChildNodes(Unknown Source)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2551)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2444)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2361)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:968)
at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:987)
at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1388)
at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:70)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:272)
at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:311)
at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:55)
at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.<init>(YarnSparkHadoopUtil.scala:56)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at java.lang.Class.newInstance(Class.java:442)
at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:414)
at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:412)
at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:412)
at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:437)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:747)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
It turned out that the above error is caused by a mismatch between the Zeppelin Livy interpreter configuration and the Livy server configuration.
In the Zeppelin logs, I can see Zeppelin posting the following JSON to the Livy server:
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} HttpAccessor.java[createRequest]:79) - Created POST request for "http://10.204.11.182:8998/sessions"
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:746) - Setting request Accept header to [text/plain, application/json, application/*+json, */*]
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:841) - Writing [{
"kind": "pyspark",
"proxyUser": "",
"conf":
However, according to https://github.com/cloudera/livy, the Livy server accepts Spark resource settings as top-level session fields like the following:
driverMemory     Amount of memory to use for the driver process     string
driverCores      Number of cores to use for the driver process      int
executorMemory   Amount of memory to use per executor process       string
executorCores    Number of cores to use for each executor           int
numExecutors     Number of executors to launch for this session     int
archives         Archives to be used in this session                List of string
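Based on that table, a session-creation request that Livy should accept would presumably look like this (a sketch; the memory and core values are arbitrary examples, not taken from my setup):

{
  "kind": "pyspark",
  "proxyUser": "",
  "driverMemory": "1g",
  "driverCores": 1,
  "executorMemory": "2g",
  "executorCores": 2,
  "numExecutors": 2,
  "conf": {}
}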
If I leave all the Spark resource settings empty in the Zeppelin web UI, Zeppelin posts the following JSON to Livy, which executes successfully:
DEBUG [2017-05-17 15:40:35,748] (
DEBUG [2017-05-17 15:40:35,751] ({pool-2-thread-2} RestTemplate.java[doWithRequest]:746) - Setting request Accept header to [text/plain, application/json, application/*+json, */*]
DEBUG [2017-05-17 15:40:35,752] ({pool-2-thread-2} RestTemplate.java[doWithRequest]:841) - Writing [{
"kind": "spark",
"proxyUser": "",
"conf": {}
It is obvious that there is a mismatch between Zeppelin and Livy in how Spark resources are specified. I am not sure whether Zeppelin or Livy should fix this.
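One way to narrow down which side is at fault would be to post the resource settings directly to the Livy endpoint seen in the logs above, bypassing Zeppelin entirely; for example (a sketch, with arbitrary example values):

curl -s -X POST http://10.204.11.182:8998/sessions \
  -H "Content-Type: application/json" \
  -d '{"kind": "pyspark", "driverMemory": "1g", "executorMemory": "2g", "numExecutors": 2, "conf": {}}'

If Livy accepts a request of this shape but fails on the one Zeppelin generates, the fix presumably belongs on the Zeppelin side.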