Zeppelin › ZEPPELIN-2558

Livy configuration mismatch


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 0.7.1
    • Fix Version/s: None
    • Component/s: livy-interpreter
    • Labels: None

    Description

      I am using Zeppelin 0.7.1 with livy-0.4-snapshot. When I edit the Livy interpreter settings related to Spark resources in the Zeppelin web UI, I get the following error from the YARN application master.

      Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
      at org.apache.xerces.dom.DeferredDocumentImpl.getNodeObject(Unknown Source)
      at org.apache.xerces.dom.DeferredDocumentImpl.synchronizeChildren(Unknown Source)
      at org.apache.xerces.dom.DeferredElementNSImpl.synchronizeChildren(Unknown Source)
      at org.apache.xerces.dom.ParentNode.hasChildNodes(Unknown Source)
      at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2551)
      at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2444)
      at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2361)
      at org.apache.hadoop.conf.Configuration.get(Configuration.java:968)
      at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:987)
      at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1388)
      at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:70)
      at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:272)
      at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:311)
      at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:55)
      at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.<init>(YarnSparkHadoopUtil.scala:56)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
      at java.lang.Class.newInstance(Class.java:442)
      at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:414)
      at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:412)
      at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:412)
      at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:437)
      at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:747)
      at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

      It turned out the above error is caused by a mismatch between the Zeppelin Livy interpreter configuration and the Livy server configuration.
      In the Zeppelin logs, I can see Zeppelin posting the following JSON to the Livy server:

      DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} HttpAccessor.java[createRequest]:79) - Created POST request for "http://10.204.11.182:8998/sessions"
      DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:746) - Setting request Accept header to [text/plain, application/json, application/*+json, */*]
      DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:841) - Writing [{
        "kind": "pyspark",
        "proxyUser": "",
        "conf": {
          "spark.executor.memory": "2",
          "spark.driver.memory": "4",
          "spark.driver.cores": "1",
          "spark.executor.cores": "1",
          "spark.executor.instances": "10"
        }
      }]

      However, from https://github.com/cloudera/livy, the Livy server accepts Spark resource settings as top-level session fields like the following:

      driverMemory      Amount of memory to use for the driver process        string
      driverCores       Number of cores to use for the driver process         int
      executorMemory    Amount of memory to use per executor process          string
      executorCores     Number of cores to use for each executor              int
      numExecutors      Number of executors to launch for this session        int
      archives          Archives to be used in this session                   List of string
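      For comparison, a session-create request shaped the way that field list describes would carry the Spark resources at the top level of the JSON body rather than inside "conf". A minimal sketch (the resource values here are illustrative, not taken from the failing request):

      ```python
      import json

      # Illustrative values; per the field list above, Livy expects Spark
      # resources as top-level fields of the POST /sessions body, not
      # inside the "conf" map of spark.* keys.
      session_request = {
          "kind": "pyspark",
          "driverMemory": "4g",    # string, e.g. "4g"
          "driverCores": 1,        # int
          "executorMemory": "2g",  # string
          "executorCores": 1,      # int
          "numExecutors": 10,      # int
      }

      body = json.dumps(session_request)
      print(body)
      ```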

      If I leave all the Spark resource settings empty in the Zeppelin web UI, Zeppelin posts the following JSON to Livy, which executes successfully.
      DEBUG [2017-05-17 15:40:35,748] ({pool-2-thread-2} HttpAccessor.java[createRequest]:79) - Created POST request for "http://10.204.11.183:8998/sessions"
      DEBUG [2017-05-17 15:40:35,751] ({pool-2-thread-2} RestTemplate.java[doWithRequest]:746) - Setting request Accept header to [text/plain, application/json, application/*+json, */*]
      DEBUG [2017-05-17 15:40:35,752] ({pool-2-thread-2} RestTemplate.java[doWithRequest]:841) - Writing [{
        "kind": "spark",
        "proxyUser": "",
        "conf": {}
      }]

      There is clearly a mismatch between Zeppelin and Livy in how Spark resources are specified. I am not sure whether Zeppelin or Livy should fix this.
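      As a sketch of the translation one side would have to perform, the spark.* keys Zeppelin posts could be mapped onto the top-level field names Livy accepts. The key mapping below is my own assumption based on the two payloads above, not actual Zeppelin or Livy code:

      ```python
      import json

      # Hypothetical mapping from the spark.* conf keys Zeppelin posts
      # to the top-level session fields Livy accepts.
      SPARK_TO_LIVY = {
          "spark.driver.memory": "driverMemory",
          "spark.driver.cores": "driverCores",
          "spark.executor.memory": "executorMemory",
          "spark.executor.cores": "executorCores",
          "spark.executor.instances": "numExecutors",
      }

      def to_livy_request(kind, conf):
          """Lift known spark.* keys out of conf into top-level Livy fields."""
          request = {"kind": kind, "conf": {}}
          for key, value in conf.items():
              livy_field = SPARK_TO_LIVY.get(key)
              if livy_field:
                  request[livy_field] = value
              else:
                  request["conf"][key] = value  # pass unknown keys through
          return request

      req = to_livy_request("pyspark", {
          "spark.executor.memory": "2g",
          "spark.driver.memory": "4g",
          "spark.executor.instances": "10",
      })
      print(json.dumps(req))
      ```

      With a translation like this, the resource settings entered in the Zeppelin UI would reach Livy in the shape its /sessions endpoint documents, instead of being ignored or misapplied inside "conf".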


          People

            Assignee: Unassigned
            Reporter: heyang wang
