SOLR-6359

Allow customization of the number of records and logs kept by UpdateLog

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels: None

      Description

      Currently UpdateLog hardcodes the number of logs and records it keeps, and the hardcoded numbers (100 records, 10 logs) can be quite low (especially the records) in a heavy-indexing setup, leading to a full recovery even if Solr was just stopped and restarted.

      These values should be customizable (even if only present as expert options).
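For reference, here is a hedged sketch of how the two options surface in solrconfig.xml once the patch is applied. The option names numRecordsToKeep and maxNumLogsToKeep are taken from the discussion in this issue; the values shown are illustrative, not recommendations:

```xml
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <!-- expert options added by SOLR-6359; the previous hardcoded
       defaults were 100 records and 10 logs -->
  <int name="numRecordsToKeep">10000</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>
```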

      1. SOLR-6359.patch
        15 kB
        Ramkumar Aiyengar

        Issue Links

          Activity

          ASF GitHub Bot added a comment -

          GitHub user andyetitmoves opened a pull request:

          https://github.com/apache/lucene-solr/pull/83

          Customize number of logs and records to keep with UpdateLog

          Patch for SOLR-6359

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/bloomberg/lucene-solr trunk-customize-ulog-keep

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/lucene-solr/pull/83.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #83


          commit 66d8ced68fb30624ad32b47bab07c7766d8c7e64
          Author: Ramkumar Aiyengar <andyetitmoves@gmail.com>
          Date: 2014-08-10T01:07:28Z

          Customize number of logs and records to keep with UpdateLog


          Hide
          Ramkumar Aiyengar added a comment -

          I haven't currently added these options to the example configs in case they were too obscure, but could add them..

          Hide
          Forest Soup added a comment -

          Is the patch only available for Solr 5.0? For Solr 4.7, can we apply the patch? Thanks!

          Hide
          Ramkumar Aiyengar added a comment -

          You might have to resolve conflicts but yeah, nothing in there should be specific to 5.0..

          Hide
          Forest Soup added a comment -

          When could we get the official build with that patch in 4.x or 5.0?

          Hide
          Forest Soup added a comment - edited

          The "numRecordsToKeep" and "maxNumLogsToKeep" values should go inside the <updateLog> element, like below:

          <!-- Enables a transaction log, used for real-time get, durability,
               and SolrCloud replica recovery. The log can grow as big as
               uncommitted changes to the index, so use of a hard autoCommit
               is recommended (see below).
               "dir" - the target directory for transaction logs, defaults to the
               solr data directory. -->
          <updateLog>
            <str name="dir">${solr.ulog.dir:}</str>
            <int name="numRecordsToKeep">10000</int>
            <int name="maxNumLogsToKeep">100</int>
          </updateLog>

          Hide
          Forest Soup added a comment - edited

          I applied the patch for SOLR-6359 on 4.7 and did some tests with the config below:

          <updateLog>
            <str name="dir">${solr.ulog.dir:}</str>
            <int name="numRecordsToKeep">10000</int>
            <int name="maxNumLogsToKeep">100</int>
          </updateLog>

          Hide
          Forest Soup added a comment -

          It works, but with a pre-condition: the newest 20% of the existing transaction log of the core being recovered must be newer than the oldest 20% of the existing transaction log of the good core.
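The pre-condition above can be sketched as a version-overlap test. This is an illustrative model only, not Solr's actual PeerSync code; the class and method names are hypothetical, and update versions are simplified to ascending longs:

```java
import java.util.List;

public class PeerSyncOverlapSketch {
    // Both lists: update versions sorted ascending, assumed non-empty.
    public static boolean sufficientOverlap(List<Long> recovering, List<Long> healthy) {
        // boundary version where the recovering core's newest ~20% begins
        long recoveringHigh = recovering.get((int) (recovering.size() * 0.8));
        // boundary version where the healthy core's oldest ~20% ends
        long healthyLow = healthy.get((int) (healthy.size() * 0.2));
        // overlap is sufficient only if the windows meet
        return recoveringHigh >= healthyLow;
    }

    public static void main(String[] args) {
        List<Long> a = List.of(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L);
        List<Long> b = List.of(5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L);
        System.out.println(sufficientOverlap(a, b)); // prints true: the windows overlap
    }
}
```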

          Hide
          Forest Soup added a comment -

          A full snapshot recovery does not clean the tlog of the core being recovered.

          Hide
          Forest Soup added a comment -

          The snapshot recovery does not clear tlog of the core being recovered. Is it an issue?

          Hide
          Shalin Shekhar Mangar added a comment -

          The snapshot recovery does not clear tlog of the core being recovered. Is it an issue?

          No, that's fine. The last two transaction log references are always kept around in case a peer sync is needed again.

          Hide
          Forest Soup added a comment - edited

          Thanks. But could this case occur?
          After a snapshot recovery of core A finishes, A's tlog is still out of date (it receives no new records from the recovery) and is not cleared. If the just-recovered core A then takes the leader role and another core C tries to recover from it, A's tlog contains only the old entries. Would core C do a peer sync with only the old records, missing the newest ones?

          Also, a snapshot recovery happens because the two cores differ too much, so the tlog gap is also too large, and the out-of-date tlog is no longer useful for peer sync.

          Our testing shows the snapshot recovery does not clean the tlog, with these steps:
          1. Cores A and B are two replicas of a shard.
          2. Core A goes down; core B takes the leader role, receives some updates, and records them in its tlog.
          3. After A comes up, it recovers from B; if the difference is too large, A does a snapshot pull recovery. During that recovery no other updates arrive, so afterwards A's tlog is unchanged: it still contains none of the recent updates from B, even though A's index is now up to date.
          4. Core A goes down again; core B remains leader and records further updates in its tlog.
          5. After A comes up again, it recovers from B, finds its tlog is still too old, and does another snapshot recovery, which should not be necessary.

          Do you agree? Thanks!

          Hide
          Ramkumar Aiyengar added a comment -

          Taking over this one as well. Mark/Shalin, see attached patch and let me know if this looks good..

          Hide
          ASF subversion and git services added a comment -

          Commit 1664825 from andyetitmoves@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1664825 ]

          SOLR-6359: Allow customization of the number of records and logs kept by UpdateLog

          Hide
          ASF subversion and git services added a comment -

          Commit 1664826 from andyetitmoves@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1664826 ]

          SOLR-6359: Allow customization of the number of records and logs kept by UpdateLog

          Hide
          ASF GitHub Bot added a comment -

          Github user andyetitmoves closed the pull request at:

          https://github.com/apache/lucene-solr/pull/83

          Hide
          Forest Soup added a comment - edited

          We have a SolrCloud of 5 Solr 4.7.0 servers, hosting one collection with 80 shards (2 replicas per shard). We made a patch by merging the code of this fix into the 4.7.0 stream. After applying the patch and uploading the config change to ZooKeeper, we restarted one of the 5 Solr servers and hit some issues on that server. Details below.

          The solrconfig.xml change:

          <updateLog>
            <str name="dir">${solr.ulog.dir:}</str>
            <int name="numRecordsToKeep">10000</int>
            <int name="maxNumLogsToKeep">100</int>
          </updateLog>

          After we restarted one Solr server while the other 4 kept running, we saw the exceptions below on the restarted one:
          ERROR - 2015-03-16 20:48:48.214; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: Exception writing document id Q049bGx0bWFpbDIxL089bGxwX3VzMQ==41703656!B68BF5EC5A4A650D85257E0A00724A3B to the index; possible analysis error.
          at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
          at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
          at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
          at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:703)
          at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:857)
          at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:556)
          at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:96)
          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:166)
          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:225)
          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
          at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:190)
          at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:116)
          at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:173)
          at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106)
          at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
          at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
          at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
          at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
          at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
          at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
          at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
          at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
          at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
          at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
          at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
          at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
          at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
          at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
          at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
          at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
          at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:314)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
          at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
          at java.lang.Thread.run(Thread.java:804)
          Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
          at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:645)
          at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:659)
          at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1525)
          at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:236)
          at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
          ... 37 more

          It looks like https://issues.apache.org/jira/browse/SOLR-4605, but I guess that's not the case.

          Is it due to transaction-log replay of the old log entries? Could you please help explain the root cause and how to avoid it?

          A rolling restart cannot solve the issue, so we have to take a full outage: stop all 5 Solr servers, then start one, wait for all its cores to become "active", then start the next.

          Do you have any better idea for a quick resolution of these failures?

          Thanks!

          Hide
          Timothy Potter added a comment -

          Bulk close after 5.1 release

          Hide
          David Smiley added a comment -

          Maybe I misunderstand the impacts of these configuration options, but why even have a maxNumLogsToKeep? i.e. why isn't it effectively unlimited? I don't care how many internal log files the updateLog would like to do its implementation-detail business, so long as I can specify that it has docs added within the last X minutes, and maybe a minimum number of docs. Sounds reasonable? Because X minutes allows me to specify a server restart worth of time. That 'X' minutes is basically the hard auto commit interval, since that's what truncates the current log to a new file. Ramkumar Aiyengar in your "heavy indexing setup" couldn't you have just set the auto commit window large enough to your liking?

          The current "numRecordsToKeep" (defaulting to 100) doesn't say if it's a min or max; it seems to be implemented as a soft maximum – the oldest log files will be removed to stay under, but we'll always have at least one log file, however big or small it may be. In my scenario where I basically don't care how many records it actually is (I care about time), I think I can basically ignore this (leave at 100).
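David's reading of numRecordsToKeep as a soft maximum (trim the oldest log files to stay under the caps, but always keep at least one log) can be sketched as follows. This is a hypothetical model, not Solr's actual UpdateLog implementation; the class and method names are made up:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TlogTrimSketch {
    // Each deque element is the record count of one tlog file, oldest first.
    // Drop the oldest files while the retained record total exceeds
    // numRecordsToKeep or the file count exceeds maxNumLogsToKeep,
    // always keeping at least one log (hence a "soft" maximum).
    public static Deque<Integer> trim(Deque<Integer> logRecordCounts,
                                      int numRecordsToKeep,
                                      int maxNumLogsToKeep) {
        int total = logRecordCounts.stream().mapToInt(Integer::intValue).sum();
        while (logRecordCounts.size() > 1
                && (total > numRecordsToKeep
                    || logRecordCounts.size() > maxNumLogsToKeep)) {
            total -= logRecordCounts.removeFirst(); // oldest log goes first
        }
        return logRecordCounts;
    }

    public static void main(String[] args) {
        Deque<Integer> logs = new ArrayDeque<>(java.util.List.of(50, 40, 30, 20));
        System.out.println(trim(logs, 60, 10)); // prints [30, 20]
    }
}
```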

          Hide
          Ramkumar Aiyengar added a comment -

          Yeah, you are right that you get better control by tweaking one vs the other. With num logs, you can get an approximation to the amount of time for which you keep logs (i.e. number of commits). I agree that's not exact, and it would be good to have the time itself as a config option.

          The other option is to leave the num logs unlimited and tweak the number of records, which helps protect against log sizes going berserk if your indexing rates vary wildly. We do this in our setups: leave the number of files unlimited, while setting the records to keep.

          Hide
          Yonik Seeley added a comment -

          Maybe I misunderstand the impacts of these configuration options, but why even have a maxNumLogsToKeep? i.e. why isn't it effectively unlimited?

          The reason for having a different number internally has to do with current implementation details and practical system limits. Each log file is kept open, hence the edge case where someone does add,commit,add,commit over and over will run the system out of file descriptors if num records to keep is high.


            People

            • Assignee:
              Ramkumar Aiyengar
              Reporter:
              Ramkumar Aiyengar
            • Votes: 1
            • Watchers: 11
