Solr / SOLR-6776

Data lost when using softCommit and TLog

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.10
    • Fix Version/s: 4.10.3
    • Component/s: None

      Description

      We enabled the update log and raised the autoCommit interval to a larger value (10 minutes).

      After a restart, we pushed one doc with softCommit=true:
      http://localhost:8983/solr/update?stream.body=<add><doc><field name="id">id1</field></doc></add>&softCommit=true
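The request above passes raw XML in the query string, which a browser will usually encode but a programmatic client may not. A minimal sketch, assuming only the JDK, of building the same URL with the body percent-encoded. The class and method names are illustrative and not part of Solr:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SoftCommitUpdateUrl {
    // Builds <baseUrl>/update?stream.body=<encoded XML>&softCommit=<flag>.
    // Hypothetical helper for illustration; not a Solr or SolrJ API.
    static String buildUpdateUrl(String baseUrl, String xmlBody, boolean softCommit) {
        try {
            String encodedBody = URLEncoder.encode(xmlBody, "UTF-8");
            return baseUrl + "/update?stream.body=" + encodedBody + "&softCommit=" + softCommit;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String body = "<add><doc><field name=\"id\">id1</field></doc></add>";
        System.out.println(buildUpdateUrl("http://localhost:8983/solr", body, true));
    }
}
```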

      Then we killed the Java process after about a minute.

      After the restart, the tlog failed to replay with the following exception, and there was no data in Solr:
      6245 [coreLoadExecutor-5-thread-1] ERROR org.apache.solr.update.UpdateLog - Failure to open existing log file (non fatal) E:\jeffery\src\apache\solr\4.10.2\solr-4.10.2\example\solr\collection1\data\tlog\tlog.0000000000000000000:org.apache.solr.common.SolrException: java.io.EOFException
      at org.apache.solr.update.TransactionLog.<init>(TransactionLog.java:181)
      at org.apache.solr.update.UpdateLog.init(UpdateLog.java:261)
      at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:134)
      at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:94)
      at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:100)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
      at java.lang.reflect.Constructor.newInstance(Unknown Source)
      at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:550)
      at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:620)
      at org.apache.solr.core.SolrCore.<init>(SolrCore.java:835)
      at org.apache.solr.core.SolrCore.<init>(SolrCore.java:646)
      at org.apache.solr.core.CoreContainer.create(CoreContainer.java:491)
      at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255)
      at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
      at java.util.concurrent.FutureTask.run(Unknown Source)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
      at java.lang.Thread.run(Unknown Source)
      Caused by: java.io.EOFException
      at org.apache.solr.common.util.FastInputStream.readUnsignedByte(FastInputStream.java:73)
      at org.apache.solr.common.util.FastInputStream.readInt(FastInputStream.java:216)
      at org.apache.solr.update.TransactionLog.readHeader(TransactionLog.java:268)
      at org.apache.solr.update.TransactionLog.<init>(TransactionLog.java:159)
      ... 19 more
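The "Caused by" frames show the failure on the first readInt inside readHeader. One plausible reading, which the report does not confirm, is that the tlog file on disk was empty or truncated when the process was killed. The sketch below uses the JDK's DataInputStream as a stand-in for Solr's FastInputStream (an assumption made so the example is self-contained) to show how reading a 4-byte int from a stream that is too short ends in an EOFException:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

class TruncatedHeaderDemo {
    // Returns true when the stream ends before a full 4-byte int can be read.
    // The byte array stands in for the contents of a log file on disk.
    static boolean headerReadHitsEof(byte[] fileContents) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileContents))) {
            in.readInt();
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("empty file:  " + headerReadHitsEof(new byte[0]));
        System.out.println("4-byte file: " + headerReadHitsEof(new byte[] {0, 0, 0, 1}));
    }
}
```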

      Looking at the code, this seems to be related to org.apache.solr.update.processor.RunUpdateProcessor: processCommit sets changesSinceCommit=false even when the commit is a soft commit.

      As a result, finish() does not call updateLog.finish():
      public void finish() throws IOException {
        if (changesSinceCommit && updateHandler.getUpdateLog() != null) {
          updateHandler.getUpdateLog().finish(null);
        }
        super.finish();
      }

      To fix this issue, I had to change RunUpdateProcessor.processCommit as below:
      if (!cmd.softCommit) {
        changesSinceCommit = false;
      }
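The interaction between the flag and finish() can be modeled in isolation. This is a simplified sketch, not Solr code: the class is hypothetical, a counter stands in for updateHandler.getUpdateLog().finish(null), and it assumes that an add sets changesSinceCommit to true, which the report implies but does not show.

```java
class ChangesSinceCommitModel {
    private final boolean clearOnSoftCommit; // true models the behavior described in the report
    private boolean changesSinceCommit = false;
    private int tlogFinishCalls = 0;

    ChangesSinceCommitModel(boolean clearOnSoftCommit) {
        this.clearOnSoftCommit = clearOnSoftCommit;
    }

    // Assumption: an add marks the request as having uncommitted changes.
    void processAdd() {
        changesSinceCommit = true;
    }

    void processCommit(boolean softCommit) {
        // With the proposed change, only a hard commit clears the flag.
        if (clearOnSoftCommit || !softCommit) {
            changesSinceCommit = false;
        }
    }

    void finish() {
        if (changesSinceCommit) {
            tlogFinishCalls++; // stands in for updateHandler.getUpdateLog().finish(null)
        }
    }

    // Replays the reported request: one add followed by a soft commit, then finish().
    static int addThenSoftCommit(boolean clearOnSoftCommit) {
        ChangesSinceCommitModel model = new ChangesSinceCommitModel(clearOnSoftCommit);
        model.processAdd();
        model.processCommit(true);
        model.finish();
        return model.tlogFinishCalls;
    }

    public static void main(String[] args) {
        System.out.println("tlog finish calls, flag cleared on soft commit: " + addThenSoftCommit(true));
        System.out.println("tlog finish calls, flag kept on soft commit:    " + addThenSoftCommit(false));
    }
}
```

In this model the add followed by a soft commit reaches finish() with the flag already cleared in the first case, so the log is never finished. In the second case the flag survives the soft commit and the log is finished once.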

        Activity

        Xu Zhang added a comment -

        Probably this is not a bug. finish() is about flushing the tlog to disk, and a soft commit is just about visibility.

        Mark Miller added a comment -

        By default, the tlog doesn't fsync; it just flushes and leans on replicas. You can configure the sync level in solrconfig.xml.
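The difference between flushing and fsyncing can be illustrated with plain JDK I/O. This is a generic sketch and not how Solr's transaction log is implemented; the class and method names are made up for the example. flush() hands buffered bytes from the JVM to the operating system, while FileDescriptor.sync() additionally asks the OS to force them to the storage device.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FlushVersusFsync {
    // Writes one record to a temporary file, flushes it, optionally fsyncs it,
    // and returns the resulting file size in bytes.
    static long writeAndMeasure(String record, boolean fsync) {
        try {
            Path file = Files.createTempFile("tlog-sketch", ".bin");
            try (FileOutputStream fos = new FileOutputStream(file.toFile());
                 BufferedOutputStream out = new BufferedOutputStream(fos)) {
                out.write(record.getBytes(StandardCharsets.UTF_8));
                out.flush(); // bytes leave the JVM buffer and reach the OS
                if (fsync) {
                    fos.getFD().sync(); // ask the OS to force the bytes to the device
                }
            }
            long size = Files.size(file);
            Files.delete(file);
            return size;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flush only:    " + writeAndMeasure("doc-id1", false) + " bytes");
        System.out.println("flush + fsync: " + writeAndMeasure("doc-id1", true) + " bytes");
    }
}
```

Both variants report the same file size while the machine stays up. The difference only matters when the operating system or hardware fails before the OS has written its cached pages out.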

        jefferyyuan added a comment - edited

        The finish() method of each update processor is always called in org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(SolrQueryRequest, SolrQueryResponse).

        When we add a doc without softCommit or commit, org.apache.solr.update.processor.RunUpdateProcessor.finish() calls getUpdateLog().finish() to fsync the tlog.

        But if we add a doc with softCommit=true, RunUpdateProcessor.finish() does not call getUpdateLog().finish(), so the tlog is not fsynced.

        This does not seem right.

        Users enable the transaction log for data durability, to make sure no data is lost.
        So I think the tlog should always be fsynced after a doc is added to Solr and before the hard commit.

        Yonik Seeley added a comment - edited

        If this is reproducible by someone, it represents a bug.

        To fix this issue: I have to change RunUpdateProcessor.processCommit like below:
        if (!cmd.softCommit) {
          changesSinceCommit = false;
        }

        It's not clear what this change is trying to fix (or why it changes anything for the reporter), but one should not have to softCommit (or commit) in order to not lose data.

        edit: I understand now (mis-read the patch the first time, missing the "!"). You are correct, this is a bug and the fix looks correct.

        jefferyyuan added a comment -

        Hi, Yonik:

        The problem here is that if we add a doc with softCommit=true (the user wants the data to be immediately searchable), processCommit sets changesSinceCommit=false,
        and then the finish method does not call updateHandler.getUpdateLog().finish(null) to fsync the tlog.

        If we kill the Java process after that, the data is lost.

        ASF subversion and git services added a comment -

        Commit 1642946 from Yonik Seeley in branch 'dev/trunk'
        [ https://svn.apache.org/r1642946 ]

        SOLR-6776: only clear changesSinceCommit on a hard commit so tlog will still be flushed on a softCommit

        ASF subversion and git services added a comment -

        Commit 1642950 from Yonik Seeley in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1642950 ]

        SOLR-6776: only clear changesSinceCommit on a hard commit so tlog will still be flushed on a softCommit

        ASF subversion and git services added a comment -

        Commit 1642951 from Yonik Seeley in branch 'dev/branches/lucene_solr_4_10'
        [ https://svn.apache.org/r1642951 ]

        SOLR-6776: only clear changesSinceCommit on a hard commit so tlog will still be flushed on a softCommit

        Yonik Seeley added a comment -

        Committed. Thanks!


          People

          • Assignee: Yonik Seeley
          • Reporter: jefferyyuan
          • Votes: 0
          • Watchers: 5
