Zeppelin / ZEPPELIN-3987

Zeppelin 0.9.0 fails to access notebooks from HDFS

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: Cloudera 6.1, Spark 2.4, Hadoop 3.0, Shiro, LDAP (Hadoop is non-secure, without Kerberos)

    Description

      Hi,

      I built Zeppelin 0.9.0-SNAPSHOT and copied my configs from the previous version (0.8.2) into the new directory. The releases after 0.8.0 (0.8.1, 0.8.2) fetch all notebooks from HDFS immediately after startup. In 0.9.0, however, the UI is empty and the logs show no sign that the notebooks were read.

      <property>
        <name>zeppelin.notebook.storage</name>
        <value>org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo</value>
        <description>hadoop compatible file system notebook persistence layer implementation</description>
      </property>

      <property>
        <name>zeppelin.notebook.dir</name>
        <value>hdfs://hadoop-master-1:8020/user/zeppelin/notebook</value>
        <description>path or URI for notebook persist</description>
      </property>
      

      zeppelin-env.sh:

      export ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false
      export ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '
      
      export JAVA_HOME=/usr/lib/jvm/java-8-oracle/
      export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
      export SPARK_CONF_DIR=$SPARK_HOME/conf
      export HADOOP_CONF_DIR=/etc/hadoop/conf:/etc/hive/conf
      
      export PYSPARK_DRIVER_PYTHON=/opt/cloudera/parcels/Anaconda/envs/py36/bin/python3
      export PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/envs/py36/bin/python3
      export PYTHONPATH=/opt/cloudera/parcels/Anaconda/envs/py36/bin/python3
      
      export SPARK_SUBMIT_OPTIONS="--jars hdfs:///user/maziyar/jars/zeppelin/graphframes/graphframes-assembly-0.7.0-spark2.3-SNAPSHOT.jar"
      
      

      The startup logs: 

      INFO [2019-02-03 17:55:41,797] ({main} ZeppelinConfiguration.java[create]:127) - Load configuration from file:/opt/zeppelin-0.9.0-SNAPSHOT/conf/zeppelin-site.xml
      INFO [2019-02-03 17:55:41,856] ({main} ZeppelinConfiguration.java[create]:135) - Server Host: 0.0.0.0
      INFO [2019-02-03 17:55:41,857] ({main} ZeppelinConfiguration.java[create]:137) - Server Port: 8080
      INFO [2019-02-03 17:55:41,857] ({main} ZeppelinConfiguration.java[create]:141) - Context Path: /
      INFO [2019-02-03 17:55:41,857] ({main} ZeppelinConfiguration.java[create]:142) - Zeppelin Version: 0.9.0-SNAPSHOT
      INFO [2019-02-03 17:55:41,876] ({main} Log.java[initialized]:193) - Logging initialized @440ms to org.eclipse.jetty.util.log.Slf4jLog
      WARN [2019-02-03 17:55:41,994] ({main} ServerConnector.java[setSoLingerTime]:458) - Ignoring deprecated socket close linger time
      INFO [2019-02-03 17:55:42,064] ({main} ZeppelinServer.java[setupWebAppContext]:403) - ZeppelinServer Webapp path: /opt/zeppelin-0.9.0-SNAPSHOT/webapps
      WARN [2019-02-03 17:55:42,223] ({main} NotebookAuthorization.java[getInstance]:79) - Notebook authorization module was called without initialization, initializing with default configuration
      WARN [2019-02-03 17:55:42,225] ({main} ZeppelinConfiguration.java[getConfigFSDir]:545) - zeppelin.config.fs.dir is not specified, fall back to local conf directory zeppelin.conf.dir
      WARN [2019-02-03 17:55:42,225] ({main} ZeppelinConfiguration.java[getConfigFSDir]:545) - zeppelin.config.fs.dir is not specified, fall back to local conf directory zeppelin.conf.dir
      INFO [2019-02-03 17:55:42,225] ({main} LocalConfigStorage.java[loadNotebookAuthorization]:84) - Load notebook authorization from file: /opt/zeppelin-0.9.0-SNAPSHOT/conf/notebook-authorization.json
      INFO [2019-02-03 17:55:42,279] ({main} Credentials.java[loadFromFile]:121) - /opt/zeppelin-0.9.0-SNAPSHOT/conf/credentials.json
      INFO [2019-02-03 17:55:42,350] ({main} NotebookServer.java[<init>]:145) - NotebookServer instantiated: org.apache.zeppelin.socket.NotebookServer@ae13544
      INFO [2019-02-03 17:55:42,350] ({main} NotebookServer.java[setServiceLocator]:150) - Injected ServiceLocator: ServiceLocatorImpl(shared-locator,0,1089504328)
      INFO [2019-02-03 17:55:42,351] ({main} NotebookServer.java[setNotebook]:156) - Injected NotebookProvider
      INFO [2019-02-03 17:55:42,353] ({main} NotebookServer.java[setNotebookService]:163) - Injected NotebookServiceProvider
      INFO [2019-02-03 17:55:42,359] ({main} ZeppelinServer.java[main]:233) - Starting zeppelin server
      INFO [2019-02-03 17:55:42,361] ({main} Server.java[doStart]:370) - jetty-9.4.14.v20181114; built: 2018-11-14T21:20:31.478Z; git: c4550056e785fb5665914545889f21dc136ad9e6; jvm 1.8.0_201-b09
      INFO [2019-02-03 17:55:44,696] ({main} StandardDescriptorProcessor.java[visitServlet]:283) - NO JSP Support for /, did not find org.eclipse.jetty.jsp.JettyJspServlet
      INFO [2019-02-03 17:55:44,711] ({main} DefaultSessionIdManager.java[doStart]:365) - DefaultSessionIdManager workerName=node0
      INFO [2019-02-03 17:55:44,711] ({main} DefaultSessionIdManager.java[doStart]:370) - No SessionScavenger set, using defaults
      INFO [2019-02-03 17:55:44,713] ({main} HouseKeeper.java[startScavenging]:149) - node0 Scavenging every 660000ms
      INFO [2019-02-03 17:55:44,720] ({main} ContextHandler.java[log]:2345) - Initializing Shiro environment
      INFO [2019-02-03 17:55:44,720] ({main} EnvironmentLoader.java[initEnvironment]:133) - Starting Shiro environment initialization.
      INFO [2019-02-03 17:55:45,078] ({main} IniRealm.java[processDefinitions]:188) - IniRealm defined, but there is no [users] section defined. This realm will not be populated with any users and it is assumed that they will be populated programatically. Users must be defined for this Realm instance to be useful.
      INFO [2019-02-03 17:55:45,078] ({main} IniSecurityManagerFactory.java[isAutoApplyRealms]:127) - Realms have been explicitly set on the SecurityManager instance - auto-setting of realms will not occur.
      INFO [2019-02-03 17:55:45,082] ({main} EnvironmentLoader.java[initEnvironment]:147) - Shiro environment initialized in 361 ms.
      INFO [2019-02-03 17:55:46,010] ({main} ContextHandler.java[doStart]:855) - Started o.e.j.w.WebAppContext@38e79ae3{zeppelin-web,/,file:///opt/zeppelin-0.9.0-SNAPSHOT/webapps/webapp/,AVAILABLE}{/opt/zeppelin-0.9.0-SNAPSHOT/zeppelin-web-0.9.0-SNAPSHOT.war}
      INFO [2019-02-03 17:55:46,027] ({main} AbstractConnector.java[doStart]:292) - Started ServerConnector@7a18e8d{HTTP/1.1,[http/1.1]}{0.0.0.0:8080}
      INFO [2019-02-03 17:55:46,027] ({main} Server.java[doStart]:407) - Started @4593ms
      INFO [2019-02-03 17:55:46,027] ({main} ZeppelinServer.java[main]:243) - Done, zeppelin server started
      

      Built with:

      mvn clean package -Pbuild-distr -DskipTests -Pspark-2.4 -Pscala-2.11 -pl '!beam'
      

      And 

      mvn clean package -Pbuild-distr -DskipTests -Dhadoop3 -Pspark-2.4 -Pscala-2.11 -pl '!beam'
      

      Both builds give the same result: the notebooks are missing. If I start Zeppelin 0.8.x with the same conf directory, all the notebooks are there.

       

      I tried to find what changed between 0.8.x and 0.9.0 in terms of configs and couldn't find anything. The only thing that stands out is zeppelin-plugins, which now contains all notebook- and launcher-related operations.

      UPDATE1: One more thing: in 0.9.0, the configuration UI shows these:

       

      zeppelin.conf.dir /opt/zeppelin-0.9.0-SNAPSHOT/conf
      zeppelin.config.fs.dir  
      zeppelin.config.storage.class org.apache.zeppelin.storage.LocalConfigStorage
      zeppelin.notebook.storage org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo

       

      The storage class doesn't make sense to me, since I set FileSystemNotebookRepo in zeppelin-site.xml. Or should this be set somewhere else in Zeppelin 0.9.0?

       

      UPDATE2: These do not appear at startup, but if I restart and refresh the UI before it asks me to log in again, I get many of these messages:

      WARN [2019-02-03 19:26:59,853] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CV3SZ914/note.json
      WARN [2019-02-03 19:26:59,854] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CWHU6GRQ/note.json
      WARN [2019-02-03 19:26:59,855] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CWQUCDWK/note.json
      WARN [2019-02-03 19:26:59,856] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CWYANJVB/note.json
      WARN [2019-02-03 19:26:59,857] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CX1WPHJ4/note.json
      WARN [2019-02-03 19:26:59,858] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CYEG7K94/note.json
      WARN [2019-02-03 19:26:59,860] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CYF8MENQ/note.json
      WARN [2019-02-03 19:26:59,861] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CYV4ESHQ/note.json
      WARN [2019-02-03 19:26:59,862] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CZQ5HSM3/note.json
      WARN [2019-02-03 19:26:59,863] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2D22Y6173/note.json
      WARN [2019-02-03 19:26:59,864] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2D2AQFX6U/note.json
      WARN [2019-02-03 19:26:59,866] ({qtp415138788-25} FileSystemStorage.java[collectNoteFiles]:146) - Unknown file: hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2D2E7UP3T/note.json
      

      The addresses are the exact paths to the notes.

      The warning comes from this check:

      https://github.com/apache/zeppelin/blob/a2a621595afcc65e965382d5a0412ade0e299610/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/FileSystemStorage.java#L137

       

      if (path.getPath().getName().endsWith(".zpln"))

       

      What is the ".zpln" extension? Is it new in 0.9.0?
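
      For illustration, here is a minimal sketch of how a file-name filter like the check quoted above treats the paths from the warnings. It is not the Zeppelin source: the class NoteFileFilter, its methods, and the example .zpln file name are hypothetical, and plain strings stand in for Hadoop path objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an extension-based note filter; not Zeppelin code. */
final class NoteFileFilter {

    /** Returns true when the last path segment ends with ".zpln". */
    static boolean isNewStyleNote(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.endsWith(".zpln");
    }

    /** Keeps the paths that pass the filter and reports the rest as unknown. */
    static List<String> collectNoteFiles(List<String> paths) {
        List<String> accepted = new ArrayList<>();
        for (String path : paths) {
            if (isNewStyleNote(path)) {
                accepted.add(path);
            } else {
                System.out.println("Unknown file: " + path);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
            // Old-style path taken from the warnings in this report.
            "hdfs://hadoop-master-1:8020/user/zeppelin/notebook/2CV3SZ914/note.json",
            // Hypothetical new-style file name, for illustration only.
            "hdfs://hadoop-master-1:8020/user/zeppelin/notebook/example_note.zpln");
        System.out.println("Accepted: " + collectNoteFiles(paths));
    }
}
```

      Under this assumption, any old-style <noteId>/note.json path fails the suffix test, which would be consistent with the "Unknown file" warnings listed above.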

      UPDATE3:

      Converting old notes to new style didn't work:

       

      INFO [2019-02-03 19:55:59,376] ({qtp415138788-320} PluginManager.java[loadNotebookRepo]:60) - Loading NotebookRepo Plugin: org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo
      INFO [2019-02-03 19:55:59,394] ({qtp415138788-320} FileSystemNotebookRepo.java[init]:50) - Creating FileSystem: org.apache.hadoop.hdfs.DistributedFileSystem
      INFO [2019-02-03 19:55:59,395] ({qtp415138788-320} FileSystemNotebookRepo.java[init]:52) - Using folder hdfs://hadoop-master-1:8020/user/zeppelin/notebook to store notebook
      INFO [2019-02-03 19:55:59,413] ({qtp415138788-320} PluginManager.java[loadOldNotebookRepo]:105) - Loading OldNotebookRepo Plugin: org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo
      INFO [2019-02-03 19:55:59,435] ({qtp415138788-320} OldFileSystemNotebookRepo.java[init]:39) - Creating FileSystem: org.apache.hadoop.hdfs.DistributedFileSystem for Zeppelin Notebook.
      INFO [2019-02-03 19:55:59,435] ({qtp415138788-320} OldFileSystemNotebookRepo.java[init]:42) - Using folder hdfs://hadoop-master-1:8020/user/zeppelin/notebook to store notebook
      INFO [2019-02-03 19:55:59,503] ({qtp415138788-320} NotebookRepoSync.java[init]:99) - Convert old note file to new style, note count: 77
      WARN [2019-02-03 19:55:59,514] ({qtp415138788-320} HttpChannel.java[handleException]:590) - /api/login
      javax.servlet.ServletException: A MultiException has 6 exceptions. They are:
      1. com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was STRING at line 131 column 2156
      2. java.lang.IllegalStateException: Unable to perform operation: create on org.apache.zeppelin.notebook.repo.NotebookRepoSync
      3. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of org.apache.zeppelin.notebook.Notebook errors were found
      4. java.lang.IllegalStateException: Unable to perform operation: resolve on org.apache.zeppelin.notebook.Notebook
      5. java.lang.IllegalArgumentException: While attempting to resolve the dependencies of org.apache.zeppelin.rest.LoginRestApi errors were found
      6. java.lang.IllegalStateException: Unable to perform operation: resolve on org.apache.zeppelin.rest.LoginRestApi
      
      at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:432)
      at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:370)
      at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:389)
      at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:342)
      at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:229)
      at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:867)
      at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1623)
      at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61)
      at org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
      at org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
      at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
      at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
      at org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:449)
      at org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:365)
      at org.apache.shiro.subject.support.SubjectCallable.doCall(SubjectCallable.java:90)
      at org.apache.shiro.subject.support.SubjectCallable.call(SubjectCallable.java:83)
      at org.apache.shiro.subject.support.DelegatingSubject.execute(DelegatingSubject.java:387)
      at org.apache.shiro.web.servlet.AbstractShiroFilter.doFilterInternal(AbstractShiroFilter.java:362)
      at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:125)
      at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
      at org.apache.zeppelin.server.CorsFilter.doFilter(CorsFilter.java:64)
      at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1610)
      at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
      at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
      at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
      at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
      at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)
      at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
      at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
      at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
      at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
      at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)
      at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
      at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
      at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
      at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
      at org.eclipse.jetty.server.Server.handle(Server.java:502)
      at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
      at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
      at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
      at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
      at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
      at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
      at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
      at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
      at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
      at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
      at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
      at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
      at java.lang.Thread.run(Thread.java:748)
      Caused by: A MultiException has 6 exceptions.

       

      Last update: I just checked, and I do have some notes in .zpln format. Still, they are not displayed and the UI is empty.

       

      Thanks.

      Attachments

        1. Screenshot 2019-02-03 17.59.35.png
          267 kB
          Maziyar PANAHI

          People

            Assignee: zjffdu Jeff Zhang
            Reporter: maziyar Maziyar PANAHI