SOLR-9830

Once IndexWriter is closed by an exception such as FileSystemException, it never returns to normal until the Solr JVM is restarted

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 6.2
    • Fix Version/s: None
    • Component/s: update
    • Labels: None
    • Environment: Red Hat 4.4.7-3, SolrCloud

    Description

      1. The collection col_test has 9 shards, each with two replicas on different Solr instances.
      2. While documents are being added to the collection with SolrJ, inject a file-handle exhaustion fault into one Solr instance, for example solr1.
      3. An update to col_test_shard3_replica1 (the shard leader) fails with a FileSystemException, and its IndexWriter is closed.
      4. After the fault is cleared, col_test_shard3_replica1 (still the leader) continues to fail every document update, and its numDocs stays lower than that of the other replica.
      5. After the Solr instance is restarted, the core accepts updates again and numDocs is consistent between the two replicas.

      I think that in SolrCloud mode the core should recover on its own in this case, rather than requiring a restart to restore its update function.
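
      The recovery behavior requested here could look roughly like the sketch below. It is a self-contained illustration, not Solr's or Lucene's actual code: `SelfHealingWriterHolder`, `FakeWriter`, `injectFault`, and `demo` are hypothetical names, and `IllegalStateException` stands in for Lucene's `AlreadyClosedException` (which is a subclass of it). The assumption is that an update hitting an already-closed writer can open a fresh writer on the same index and retry once, instead of failing until the JVM is restarted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystemException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: replace a writer that closed itself after a tragic error. */
final class SelfHealingWriterHolder {

    /** Stand-in for an index writer; this is NOT the Lucene IndexWriter API. */
    static final class FakeWriter {
        private final List<String> index;
        private boolean closed;
        private boolean failNextAdd; // simulates the injected file-handle fault

        FakeWriter(List<String> index) {
            this.index = index;
        }

        void add(String doc) throws IOException {
            if (closed) {
                // Plays the role of AlreadyClosedException in the real trace.
                throw new IllegalStateException("this writer is closed");
            }
            if (failNextAdd) {
                closed = true; // a tragic error closes the writer permanently
                throw new FileSystemException("_4ha.fdx", null, "Too many open files in system");
            }
            index.add(doc);
        }
    }

    private final List<String> index = new ArrayList<>(); // shared "on-disk" state
    private FakeWriter writer = new FakeWriter(index);
    private int reopenCount;

    /** Adds a document; if the writer was already closed, reopens it and retries once. */
    synchronized void add(String doc) throws IOException {
        try {
            writer.add(doc);
        } catch (IllegalStateException alreadyClosed) {
            writer = new FakeWriter(index); // fresh writer instead of staying broken
            reopenCount++;
            writer.add(doc);
        }
    }

    synchronized void injectFault() {
        writer.failNextAdd = true;
    }

    synchronized int numDocs() {
        return index.size();
    }

    synchronized int reopenCount() {
        return reopenCount;
    }

    /** Replays the scenario from the report; returns {numDocs, reopenCount}. */
    static int[] demo() {
        SelfHealingWriterHolder holder = new SelfHealingWriterHolder();
        try {
            holder.add("doc-a");
            holder.injectFault();
            try {
                holder.add("doc-b"); // fails and closes the writer
            } catch (FileSystemException expected) {
                // The update that hit the fault is lost in this sketch.
            }
            holder.add("doc-c"); // fault cleared: the holder reopens and succeeds
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] { holder.numDocs(), holder.reopenCount() };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("numDocs=" + result[0] + " reopens=" + result[1]);
    }
}
```

      In this sketch the update that triggered the fault is simply dropped; a real fix would presumably also need to bring the replica back in sync, which the sketch does not attempt.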

      2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [DWPT][http-nio-21101-exec-20]: now abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [DWPT][http-nio-21101-exec-20]: done abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: hit exception updating document | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: hit tragic FileSystemException inside updateDocument | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: all running merges have aborted | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback: done finish merges | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | [DW][http-nio-21101-exec-20]: abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,939 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: flush postings as segment _4h9 numDocs=3798 | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: now abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: done abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | [DW][http-nio-21101-exec-20]: done abort success=true | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DW][commitScheduler-46-thread-1]: commitScheduler-46-thread-1 finishFullFlush success=false | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback: infos=_4g7(6.2.0):C59169/23684:delGen=4 _4gq(6.2.0):C67474/11636:delGen=1 _4gg(6.2.0):C64067/15664:delGen=2 _4gr(6.2.0):C13131 _4gs(6.2.0):C966 _4gt(6.2.0):C4543 _4gu(6.2.0):C6960 _4gv(6.2.0):C2544 | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [IW][commitScheduler-46-thread-1]: hit exception during NRT reader | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,967 | INFO | http-nio-21101-exec-20 | [col_test_shard3_replica1] webapp=/solr path=/update params={wt=javabin&version=2} {add=[5____5 (1552493084330164224), 24____5 (1552493084330164225), 28____5 (1552493084331212800), 32____5 (1552493084331212801), 44____5 (1552493084331212802), 46____5 (1552493084331212803), 64____5 (1552493084331212804), 94____5 (1552493084331212805), 100____5 (1552493084331212806), 119____5 (1552493084331212807), ... (74 adds)]} 0 43 | org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:187)

      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:2143)
      at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:695)
      at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:471)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:450)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:400)
      at org.apache.solr.servlet.SolrAuthorizationFilter.doFilter(SolrAuthorizationFilter.java:195)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.check.SolrParaCheckFilter.doFilter(SolrParaCheckFilter.java:201)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.audit.AuditFilter.doFilter(AuditFilter.java:145)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:611)
      at com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:578)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.cas.HttpServletRequestWrapperFilterWrapper.doFilter(HttpServletRequestWrapperFilterWrapper.java:37)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.cas.Cas20ProxyReceivingTicketValidationFilterWrapper.doFilter(Cas20ProxyReceivingTicketValidationFilterWrapper.java:71)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.cas.Cas20AuthenticationFilterWrapper.doFilter(Cas20AuthenticationFilterWrapper.java:60)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.cas.LogoutFilter.doFilter(LogoutFilter.java:84)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.monitor.MemMonitorFilter.doFilter(MemMonitorFilter.java:81)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.ServerRealmFilter.doFilter(ServerRealmFilter.java:55)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.RerouteRequestFilter.doFilter(RerouteRequestFilter.java:58)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:218)
      at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
      at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
      at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
      at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:442)
      at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1083)
      at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:640)
      at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1756)
      at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1715)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
      at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:740)
      at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:754)
      at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1558)
      at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:279)
      at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:211)
      at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:166)
      ... 73 more
      Caused by: java.nio.file.FileSystemException: /srv/BigData/solr/solrserveradmin/col_test_shard3_replica1/data/index/_4ha.fdx: Too many open files in system
      at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
      at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
      at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
      at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
      at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
      at java.nio.file.Files.newOutputStream(Files.java:216)
      at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
      at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
      at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
      at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
      at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
      at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:108)
      at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
      at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
      at org.apache.lucene.index.DefaultIndexingChain.initStoredFieldsWriter(DefaultIndexingChain.java:83)
      at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:331)
      at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:368)
      at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:231)
      at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:478)
      at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1562)
      ... 76 more
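
      In the trace above, the exception that updates fail with (AlreadyClosedException) is only a symptom; the original trigger is the nested FileSystemException. A small helper that walks the cause chain makes that visible in monitoring or test code. This is a minimal sketch using only the JDK: `RootCauseDemo`, `rootCause`, and `sampleChain` are hypothetical names, the outer `RuntimeException` is a placeholder for the wrapper exception that the trace does not show, and `IllegalStateException` again stands in for Lucene's `AlreadyClosedException`.

```java
import java.nio.file.FileSystemException;

/** Hypothetical helper for finding the innermost cause of a failed update. */
final class RootCauseDemo {

    /** Follows getCause() until the innermost throwable is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** Rebuilds a chain shaped like the one in the report (the wrapper type is assumed). */
    static Throwable sampleChain() {
        FileSystemException tragic = new FileSystemException(
                "/srv/BigData/solr/solrserveradmin/col_test_shard3_replica1/data/index/_4ha.fdx",
                null,
                "Too many open files in system");
        IllegalStateException closed =
                new IllegalStateException("this IndexWriter is closed", tragic);
        return new RuntimeException("update failed", closed);
    }

    public static void main(String[] args) {
        Throwable root = rootCause(sampleChain());
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```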


            People

              Assignee: Unassigned
              Reporter: Daisy.Yuan (daisy_yu)
              Votes: 2
              Watchers: 9
