Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-40442

Unstable Spark history server: DB is closed

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • 3.2.2
    • None
    • Web UI
    • None

    Description

      Since we upgraded our spark history server to 3.2.2, it has been unstable. We get following log lines continuously.

      2022-09-15 08:54:57,000 WARN /api/v1/applications/application_xxxx/1/executors javax.servlet.ServletException: java.lang.IllegalStateException: DB is closed. at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:410) at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346) at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:366) at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:319) at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205) at org.sparkproject.jetty.servlet.ServletHolder.handle(ServletHolder.java:799) at org.sparkproject.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1631) at org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:95) at org.sparkproject.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193) at org.sparkproject.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601) at org.sparkproject.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548) at org.sparkproject.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) at org.sparkproject.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434) at org.sparkproject.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) at org.sparkproject.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501) at org.sparkproject.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) at org.sparkproject.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349) at org.sparkproject.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.sparkproject.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:763) at org.sparkproject.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:234) at org.sparkproject.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.sparkproject.jetty.server.Server.handle(Server.java:516) at org.sparkproject.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:400) at org.sparkproject.jetty.server.HttpChannel.dispatch(HttpChannel.java:645) at org.sparkproject.jetty.server.HttpChannel.handle(HttpChannel.java:392) at org.sparkproject.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) at org.sparkproject.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.sparkproject.jetty.io.FillInterest.fillable(FillInterest.java:105) at org.sparkproject.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338) at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315) at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) at org.sparkproject.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409) at org.sparkproject.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalStateException: DB is closed. at org.apache.spark.util.kvstore.LevelDB.db(LevelDB.java:364) at org.apache.spark.util.kvstore.LevelDBIterator.(LevelDBIterator.java:51) at org.apache.spark.util.kvstore.LevelDB$1.iterator(LevelDB.java:253) at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:60) at scala.collection.IterableLike.foreach(IterableLike.scala:74) at scala.collection.IterableLike.foreach$(IterableLike.scala:73) at scala.collection.AbstractIterable.foreach(Iterable.scala:56) at scala.collection.TraversableLike.map(TraversableLike.scala:286) at scala.collection.TraversableLike.map$(TraversableLike.scala:279) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at org.apache.spark.status.AppStatusStore.executorList(AppStatusStore.scala:92) at org.apache.spark.deploy.history.HistoryAppStatusStore.executorList(HistoryAppStatusStore.scala:46) at org.apache.spark.status.api.v1.AbstractApplicationResource.$anonfun$executorList$1(OneApplicationResource.scala:53) at org.apache.spark.status.api.v1.BaseAppResource.$anonfun$withUI$1(ApiRootResource.scala:142) at org.apache.spark.deploy.history.ApplicationCache.withSparkUI(ApplicationCache.scala:121) at org.apache.spark.deploy.history.HistoryServer.withSparkUI(HistoryServer.scala:133) at org.apache.spark.status.api.v1.BaseAppResource.withUI(ApiRootResource.scala:137) at org.apache.spark.status.api.v1.BaseAppResource.withUI$(ApiRootResource.scala:135) at org.apache.spark.status.api.v1.AbstractApplicationResource.withUI(OneApplicationResource.scala:32) at org.apache.spark.status.api.v1.AbstractApplicationResource.executorList(OneApplicationResource.scala:53) at sun.reflect.GeneratedMethodAccessor241.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124) at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167) at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219) at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79) at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:475) at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:397) at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81) at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:255) at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) at org.glassfish.jersey.internal.Errors.process(Errors.java:292) at org.glassfish.jersey.internal.Errors.process(Errors.java:274) at org.glassfish.jersey.internal.Errors.process(Errors.java:244) at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234) at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:680) at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394) ... 36 more 2022-09-15 08:54:57,000 WARN unhandled due to prior sendError javax.servlet.ServletException: java.lang.IllegalStateException: DB is closed. at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:410) at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346) at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:366) at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:319) at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205) at org.sparkproject.jetty.servlet.ServletHolder.handle(ServletHolder.java:799) at org.sparkproject.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1631) at org.apache.spark.ui.HttpSecurityFilter.doFilter(HttpSecurityFilter.scala:95) at org.sparkproject.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193) at org.sparkproject.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601) at org.sparkproject.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548) at org.sparkproject.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) at org.sparkproject.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434) at org.sparkproject.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) at org.sparkproject.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501) at org.sparkproject.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) at org.sparkproject.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1349) at org.sparkproject.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) at org.sparkproject.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:763) at org.sparkproject.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:234) at org.sparkproject.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) at org.sparkproject.jetty.server.Server.handle(Server.java:516) at org.sparkproject.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:400) at org.sparkproject.jetty.server.HttpChannel.dispatch(HttpChannel.java:645) at org.sparkproject.jetty.server.HttpChannel.handle(HttpChannel.java:392) at org.sparkproject.jetty.server.HttpConnection.onFillable(HttpConnection.java:277) at org.sparkproject.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) at org.sparkproject.jetty.io.FillInterest.fillable(FillInterest.java:105) at org.sparkproject.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338) at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315) at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) at org.sparkproject.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) at org.sparkproject.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409) at org.sparkproject.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) at org.sparkproject.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IllegalStateException: DB is closed. at org.apache.spark.util.kvstore.LevelDB.db(LevelDB.java:364) at org.apache.spark.util.kvstore.LevelDBIterator.(LevelDBIterator.java:51) at org.apache.spark.util.kvstore.LevelDB$1.iterator(LevelDB.java:253) at scala.collection.convert.Wrappers$JIterableWrapper.iterator(Wrappers.scala:60) at scala.collection.IterableLike.foreach(IterableLike.scala:74) at scala.collection.IterableLike.foreach$(IterableLike.scala:73) at scala.collection.AbstractIterable.foreach(Iterable.scala:56) at scala.collection.TraversableLike.map(TraversableLike.scala:286) at scala.collection.TraversableLike.map$(TraversableLike.scala:279) at scala.collection.AbstractTraversable.map(Traversable.scala:108) at org.apache.spark.status.AppStatusStore.executorList(AppStatusStore.scala:92) at org.apache.spark.deploy.history.HistoryAppStatusStore.executorList(HistoryAppStatusStore.scala:46) at org.apache.spark.status.api.v1.AbstractApplicationResource.$anonfun$executorList$1(OneApplicationResource.scala:53) at org.apache.spark.status.api.v1.BaseAppResource.$anonfun$withUI$1(ApiRootResource.scala:142) at org.apache.spark.deploy.history.ApplicationCache.withSparkUI(ApplicationCache.scala:121) at org.apache.spark.deploy.history.HistoryServer.withSparkUI(HistoryServer.scala:133) at org.apache.spark.status.api.v1.BaseAppResource.withUI(ApiRootResource.scala:137) at org.apache.spark.status.api.v1.BaseAppResource.withUI$(ApiRootResource.scala:135) at org.apache.spark.status.api.v1.AbstractApplicationResource.withUI(OneApplicationResource.scala:32) at org.apache.spark.status.api.v1.AbstractApplicationResource.executorList(OneApplicationResource.scala:53) at sun.reflect.GeneratedMethodAccessor241.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:124) at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:167) at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219) at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:79) at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:475) at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:397) at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81) at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:255) at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) at org.glassfish.jersey.internal.Errors.process(Errors.java:292) at org.glassfish.jersey.internal.Errors.process(Errors.java:274) at org.glassfish.jersey.internal.Errors.process(Errors.java:244) at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:234) at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:680) at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394) ... 36 more

      The history server stops responding for our users.

       

      Spark version: 3.2.2

      daemon memory set to: 32768 MB

      Cluster is kerberised.

      spark.history.fs.cleaner.enabled: true
      spark.history.fs.cleaner.interval: 1d
      spark.history.fs.cleaner.maxAge: 14d
      spark.history.fs.logDirectory: hdfs:///xxxx/
      spark.history.kerberos.keytab: /xxx/xxx.keytab
      spark.history.kerberos.principal xxx/xxx@XXXX.com
      spark.history.provider org.apache.spark.deploy.history.FsHistoryProvider
      spark.history.store.path /local_xxxxx
      spark.history.ui.port 34000

      Attachments

        Activity

          People

            Unassigned Unassigned
            santosh.pingale Santosh Pingale
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: