SOLR-9830

Once IndexWriter is closed by an exception such as FileSystemException, it never returns to normal until the Solr JVM is restarted

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 6.2
    • Fix Version/s: None
    • Component/s: update
    • Security Level: Public (Default Security Level. Issues are Public)
    • Labels:
      None
    • Environment:

      Red Hat 4.4.7-3,SolrCloud

      Description

      1. Collection col_test has 9 shards, each with two replicas on different Solr instances.
      2. While updating documents in the collection with SolrJ, inject a file-handle exhaustion fault into one Solr instance, e.g. solr1.
      3. The update to col_test_shard3_replica1 (the leader) fails with a FileSystemException, and its IndexWriter is closed.
      4. After the fault is cleared, col_test_shard3_replica1 (still the leader) still cannot accept document updates, and its numDocs stays lower than that of the other replica.
      5. After the Solr instance is restarted, it accepts document updates again and numDocs is consistent between the two replicas.

      I think that in this case, in SolrCloud mode, the core should recover by itself rather than require a restart to restore its update function.
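The self-recovery asked for here could follow a "reopen on closed writer" pattern. The following is only a stdlib-only sketch of that pattern, not Solr's or Lucene's code: `Writer`, `ClosedWriterException`, and `SelfHealingWriter` are hypothetical stand-ins for IndexWriter, AlreadyClosedException, and whichever Solr component owns the writer. It assumes that a writer closed by a fatal error cannot be reused and that opening a fresh one is safe once the fault has cleared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for Lucene's AlreadyClosedException. */
class ClosedWriterException extends RuntimeException {
    ClosedWriterException(String msg) { super(msg); }
}

/** Hypothetical stand-in for an index writer that a fatal error can close. */
class Writer {
    private boolean open = true;
    private final List<String> sink;

    Writer(List<String> sink) { this.sink = sink; }

    void add(String doc) {
        if (!open) throw new ClosedWriterException("this writer is closed");
        sink.add(doc);
    }

    /** Simulates the tragic event that closes the writer for good. */
    void failTragically() { open = false; }
}

/** Replaces a closed writer with a fresh one and retries once, instead of needing a restart. */
class SelfHealingWriter {
    private final Supplier<Writer> factory;
    private Writer current;
    private int reopenCount = 0;

    SelfHealingWriter(Supplier<Writer> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    synchronized void add(String doc) {
        try {
            current.add(doc);
        } catch (ClosedWriterException e) {
            // The old writer is unusable: open a new one and retry exactly once.
            current = factory.get();
            reopenCount++;
            current.add(doc);
        }
    }

    synchronized Writer current() { return current; }
    synchronized int reopenCount() { return reopenCount; }
}

class RecoveryDemo {
    public static void main(String[] args) {
        List<String> index = new ArrayList<>();
        SelfHealingWriter writer = new SelfHealingWriter(() -> new Writer(index));
        writer.add("doc1");
        writer.current().failTragically(); // fault injected, writer is now closed
        writer.add("doc2");                // triggers one reopen, then succeeds
        System.out.println("docs=" + index.size() + " reopens=" + writer.reopenCount());
    }
}
```

The retry is limited to one attempt on purpose: if the underlying fault (for example, exhausted file handles) is still present, the second failure propagates to the caller rather than looping.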

      2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [DWPT][http-nio-21101-exec-20]: now abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [DWPT][http-nio-21101-exec-20]: done abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: hit exception updating document | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: hit tragic FileSystemException inside updateDocument | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: all running merges have aborted | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback: done finish merges | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | [DW][http-nio-21101-exec-20]: abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,939 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: flush postings as segment _4h9 numDocs=3798 | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: now abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: done abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | [DW][http-nio-21101-exec-20]: done abort success=true | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DW][commitScheduler-46-thread-1]: commitScheduler-46-thread-1 finishFullFlush success=false | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback: infos=_4g7(6.2.0):C59169/23684:delGen=4 _4gq(6.2.0):C67474/11636:delGen=1 _4gg(6.2.0):C64067/15664:delGen=2 _4gr(6.2.0):C13131 _4gs(6.2.0):C966 _4gt(6.2.0):C4543 _4gu(6.2.0):C6960 _4gv(6.2.0):C2544 | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [IW][commitScheduler-46-thread-1]: hit exception during NRT reader | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
      2016-12-01 14:13:00,967 | INFO | http-nio-21101-exec-20 | [col_test_shard3_replica1] webapp=/solr path=/update params={wt=javabin&version=2} {add=[5____5 (1552493084330164224), 24____5 (1552493084330164225), 28____5 (1552493084331212800), 32____5 (1552493084331212801), 44____5 (1552493084331212802), 46____5 (1552493084331212803), 64____5 (1552493084331212804), 94____5 (1552493084331212805), 100____5 (1552493084331212806), 119____5 (1552493084331212807), ... (74 adds)]} 0 43 | org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:187)

      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:2143)
      at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:695)
      at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:471)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:450)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:400)
      at org.apache.solr.servlet.SolrAuthorizationFilter.doFilter(SolrAuthorizationFilter.java:195)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.check.SolrParaCheckFilter.doFilter(SolrParaCheckFilter.java:201)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.audit.AuditFilter.doFilter(AuditFilter.java:145)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:611)
      at com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:578)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.cas.HttpServletRequestWrapperFilterWrapper.doFilter(HttpServletRequestWrapperFilterWrapper.java:37)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.cas.Cas20ProxyReceivingTicketValidationFilterWrapper.doFilter(Cas20ProxyReceivingTicketValidationFilterWrapper.java:71)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.cas.Cas20AuthenticationFilterWrapper.doFilter(Cas20AuthenticationFilterWrapper.java:60)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.cas.LogoutFilter.doFilter(LogoutFilter.java:84)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.monitor.MemMonitorFilter.doFilter(MemMonitorFilter.java:81)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.ServerRealmFilter.doFilter(ServerRealmFilter.java:55)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at com.huawei.solr.security.auth.RerouteRequestFilter.doFilter(RerouteRequestFilter.java:58)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
      at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:218)
      at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
      at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
      at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
      at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:442)
      at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1083)
      at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:640)
      at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1756)
      at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1715)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
      at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:740)
      at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:754)
      at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1558)
      at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:279)
      at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:211)
      at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:166)
      ... 73 more
      Caused by: java.nio.file.FileSystemException: /srv/BigData/solr/solrserveradmin/col_test_shard3_replica1/data/index/_4ha.fdx: Too many open files in system
      at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
      at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
      at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
      at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
      at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
      at java.nio.file.Files.newOutputStream(Files.java:216)
      at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
      at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
      at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
      at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
      at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
      at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:108)
      at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
      at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
      at org.apache.lucene.index.DefaultIndexingChain.initStoredFieldsWriter(DefaultIndexingChain.java:83)
      at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:331)
      at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:368)
      at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:231)
      at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:478)
      at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1562)
      ... 76 more
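The trace above nests the original FileSystemException underneath the AlreadyClosedException. Code that wants to tell "writer closed because of a file system fault" apart from other failures could walk the cause chain, roughly as sketched below. `CauseChain.findCause` is a hypothetical helper, not part of Solr or Lucene, and the demo uses `IllegalStateException` as a stand-in for AlreadyClosedException so that the block needs only the JDK.

```java
import java.nio.file.FileSystemException;
import java.util.Optional;

/** Hypothetical helper: finds the first throwable of a given type in a cause chain. */
final class CauseChain {
    private CauseChain() {}

    static <T extends Throwable> Optional<T> findCause(Throwable t, Class<T> type) {
        int depth = 0;
        // The depth cap guards against cyclic cause chains.
        for (Throwable cur = t; cur != null && depth < 64; cur = cur.getCause(), depth++) {
            if (type.isInstance(cur)) {
                return Optional.of(type.cast(cur));
            }
        }
        return Optional.empty();
    }
}

class CauseChainDemo {
    public static void main(String[] args) {
        // Rebuild the shape of the reported trace with JDK-only exception types.
        FileSystemException root =
            new FileSystemException("_4ha.fdx", null, "Too many open files in system");
        RuntimeException closed = new IllegalStateException("this IndexWriter is closed", root);
        RuntimeException outer = new RuntimeException("update failed", closed);

        CauseChain.findCause(outer, FileSystemException.class)
                  .ifPresent(e -> System.out.println("root cause: " + e.getReason()));
    }
}
```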

    People

    • Assignee: Unassigned
    • Reporter: Daisy.Yuan (daisy_yu)
    • Votes: 2
    • Watchers: 6
