Zeppelin / ZEPPELIN-1332

Removing spark-dependencies

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Why?
      The full binary package of the latest Zeppelin release is over 500MB, and it keeps growing as more interpreters are added. Compared to the Spark binary packages (spark-2.0.0-bin-hadoop2.7.tgz is 178MB and spark-2.0.0-bin-without-hadoop.tgz is 109MB), the Zeppelin package is very large. Many Spark interpreter users also run their own Spark installation rather than Zeppelin's embedded one, so they don't need spark-dependencies to be included. A first option for addressing this was suggested by jongyoul in PR#1115.

      New suggestion
      I know Zeppelin's embedded Spark is very useful for Zeppelin beginners, because they don't need to download Spark or set SPARK_HOME themselves when they want to use the Spark interpreter in Zeppelin. So instead of just removing spark-dependencies/pom.xml, I would like to suggest downloading a Spark binary package (maybe spark-2.0.0-bin-hadoop2.7.tgz?) from a mirror site using a shell script. Please see the attached flow chart image.
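
      To make the suggestion concrete, here is a minimal sketch of what such a download script could look like. It is not an existing Zeppelin script: the function names, the `local-spark` target directory, the `SPARK_VERSION`, `HADOOP_PROFILE` and `SPARK_MIRROR` variables, and the use of archive.apache.org as the download base are all assumptions for illustration. The idea is to download only when SPARK_HOME does not already point to an existing Spark installation.

```shell
#!/usr/bin/env bash
# Sketch only: all names below (variables, functions, target directory)
# are hypothetical and not part of Zeppelin.

SPARK_VERSION="${SPARK_VERSION:-2.0.0}"
HADOOP_PROFILE="${HADOOP_PROFILE:-hadoop2.7}"
# Assumed download base; a real script would probably pick a closer mirror.
SPARK_MIRROR="${SPARK_MIRROR:-https://archive.apache.org/dist/spark}"

# Print the archive file name, e.g. spark-2.0.0-bin-hadoop2.7.tgz.
spark_archive_name() {
  echo "spark-${SPARK_VERSION}-bin-${HADOOP_PROFILE}.tgz"
}

# Print the full download URL for the configured version.
spark_download_url() {
  echo "${SPARK_MIRROR}/spark-${SPARK_VERSION}/$(spark_archive_name)"
}

# Succeed (exit status 0) only when SPARK_HOME is unset, empty, or not an
# existing directory, i.e. when a local copy of Spark is needed.
needs_local_spark() {
  [ -z "${SPARK_HOME:-}" ] || [ ! -d "${SPARK_HOME}" ]
}

# Download and unpack Spark into the given directory (default: local-spark),
# then print the path of the extracted distribution.
download_spark() {
  local target_dir="${1:-local-spark}"
  local archive
  archive="$(spark_archive_name)"
  mkdir -p "${target_dir}" || return 1
  curl -fSL -o "${target_dir}/${archive}" "$(spark_download_url)" || return 1
  tar -xzf "${target_dir}/${archive}" -C "${target_dir}" || return 1
  echo "${target_dir}/${archive%.tgz}"
}
```

      A launcher could call `needs_local_spark` first and, only if it succeeds, run `download_spark` and use the printed path as SPARK_HOME. Users who already have their own Spark would skip the download entirely, which is the case this issue is mainly concerned with.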

            People

              Ahyoung Ryu
              Votes: 0
              Watchers: 7