Lucene - Core / LUCENE-8458

Carry-over hard-deletes after merge may not adjust soft-delete count

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.5, 8.0
    • Fix Version/s: 7.5, 8.0
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Attached is a test that trips a PendingDeletes assertion in roughly 5% of runs.

      The assertion is violated because IndexWriter#carryOverHardDeletes does not reduce the soft-delete count when it carries hard-deletes over to the merged segment. If the newly merged segment contains soft-deleted documents, its PendingDeletes needs a segment reader in order to "transfer" the affected documents from the soft-delete count to the hard-delete count.

      testSoftDeleteWhileMergeSurvives (introduced in LUCENE-8293) always passes because the segment warmer used in that test forces ReadersAndUpdates to open a new segment reader.
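
      The count "transfer" described above can be illustrated with a toy model. This is only a sketch under stated assumptions: SoftDeleteCountSketch and all of its methods are hypothetical names invented for illustration, not Lucene's PendingSoftDeletes or IndexWriter code. It shows why hard-deleting an already soft-deleted document without decrementing the soft-delete count leaves the cached count higher than a recount, which is the kind of mismatch the assertion reports.

```java
import java.util.BitSet;

/** Toy model of per-segment delete bookkeeping. Hypothetical; NOT Lucene's implementation. */
class SoftDeleteCountSketch {
    private final BitSet softDeleted = new BitSet();
    private final BitSet hardDeleted = new BitSet();
    private int softDeleteCount; // cached count, analogous to the value the assertion checks
    private int hardDeleteCount;

    /** Soft-deletes a document unless it is already soft- or hard-deleted. */
    boolean softDelete(int docId) {
        if (hardDeleted.get(docId) || softDeleted.get(docId)) {
            return false;
        }
        softDeleted.set(docId);
        softDeleteCount++;
        return true;
    }

    /** Carries over a hard-delete but forgets the soft-delete count (models the bug). */
    boolean hardDeleteNaive(int docId) {
        if (hardDeleted.get(docId)) {
            return false;
        }
        hardDeleted.set(docId);
        hardDeleteCount++;
        return true;
    }

    /** Carries over a hard-delete and transfers the count if the doc was soft-deleted. */
    boolean hardDelete(int docId) {
        if (hardDeleted.get(docId)) {
            return false;
        }
        hardDeleted.set(docId);
        hardDeleteCount++;
        if (softDeleted.get(docId)) {
            softDeleted.clear(docId);
            softDeleteCount--; // the adjustment that the naive variant skips
        }
        return true;
    }

    /** Recounts soft-deleted docs that are not hard-deleted, as a fresh reader would see them. */
    int recountSoftDeletes() {
        BitSet stillSoft = (BitSet) softDeleted.clone();
        stillSoft.andNot(hardDeleted);
        return stillSoft.cardinality();
    }

    int getSoftDeleteCount() {
        return softDeleteCount;
    }

    int getHardDeleteCount() {
        return hardDeleteCount;
    }

    /** True when the cached count agrees with the recount. */
    boolean isConsistent() {
        return softDeleteCount == recountSoftDeletes();
    }

    public static void main(String[] args) {
        SoftDeleteCountSketch naive = new SoftDeleteCountSketch();
        SoftDeleteCountSketch fixed = new SoftDeleteCountSketch();
        for (int doc = 0; doc < 3; doc++) {
            naive.softDelete(doc);
            fixed.softDelete(doc);
        }
        naive.hardDeleteNaive(1); // cached soft count stays at 3
        fixed.hardDelete(1);      // cached soft count drops to 2
        System.out.println("naive consistent: " + naive.isConsistent());
        System.out.println("fixed consistent: " + fixed.isConsistent());
    }
}
```

      In this model the naive path keeps a cached count of 3 while a recount finds 2 documents that are still only soft-deleted; the transferring path keeps both at 2.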

      NOTE: reproduce with: ant test  -Dtestcase=TestSoftDeletesRetentionMergePolicy -Dtests.method=testMixedSoftDeletesAndHardDeletes -Dtests.seed=FFED48B49B9F6AA5 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=zh-Hans -Dtests.timezone=Atlantic/South_Georgia -Dtests.asserts=true -Dtests.file.encoding=UTF-8
      Aug 19, 2018 12:11:10 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
      WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #0,5,TGRP-TestSoftDeletesRetentionMergePolicy]
      org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError: softDeleteCount doesn't match 21 != 19
      	at __randomizedtesting.SeedInfo.seed([FFED48B49B9F6AA5]:0)
      	at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:704)
      	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
      Caused by: java.lang.AssertionError: softDeleteCount doesn't match 21 != 19
      	at org.apache.lucene.index.PendingSoftDeletes.onNewReader(PendingSoftDeletes.java:87)
      	at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:173)
      	at org.apache.lucene.index.ReadersAndUpdates.getLatestReader(ReadersAndUpdates.java:237)
      	at org.apache.lucene.index.PendingSoftDeletes.ensureInitialized(PendingSoftDeletes.java:189)
      	at org.apache.lucene.index.PendingSoftDeletes.isFullyDeleted(PendingSoftDeletes.java:200)
      	at org.apache.lucene.index.ReadersAndUpdates.isFullyDeleted(ReadersAndUpdates.java:744)
      	at org.apache.lucene.index.IndexWriter.isFullyDeleted(IndexWriter.java:5161)
      	at org.apache.lucene.index.IndexWriter.commitMerge(IndexWriter.java:3926)
      	at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4592)
      	at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4058)
      	at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625)
      	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662)
      
      com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=16, name=Lucene Merge Thread #0, state=RUNNABLE, group=TGRP-TestSoftDeletesRetentionMergePolicy]
      
      	at __randomizedtesting.SeedInfo.seed([FFED48B49B9F6AA5:B2667DCCD81812E2]:0)
      Caused by: org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError: softDeleteCount doesn't match 21 != 19
      	at __randomizedtesting.SeedInfo.seed([FFED48B49B9F6AA5]:0)
      	at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:704)
      	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
      Caused by: java.lang.AssertionError: softDeleteCount doesn't match 21 != 19
      	at org.apache.lucene.index.PendingSoftDeletes.onNewReader(PendingSoftDeletes.java:87)
      	at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:173)
      	at org.apache.lucene.index.ReadersAndUpdates.getLatestReader(ReadersAndUpdates.java:237)
      	at org.apache.lucene.index.PendingSoftDeletes.ensureInitialized(PendingSoftDeletes.java:189)
      	at org.apache.lucene.index.PendingSoftDeletes.isFullyDeleted(PendingSoftDeletes.java:200)
      	at org.apache.lucene.index.ReadersAndUpdates.isFullyDeleted(ReadersAndUpdates.java:744)
      	at org.apache.lucene.index.IndexWriter.isFullyDeleted(IndexWriter.java:5161)
      	at org.apache.lucene.index.IndexWriter.commitMerge(IndexWriter.java:3926)
      	at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4592)
      	at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4058)
      	at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625)
      	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662)
      

       

        Attachments

        1. test.patch
          7 kB
          Nhat Nguyen
        2. LUCENE-8458.patch
          16 kB
          Nhat Nguyen
        3. LUCENE-8458.patch
          18 kB
          Nhat Nguyen
        4. LUCENE-8144.patch
          11 kB
          Nhat Nguyen
        5. LUCENE-8144.patch
          12 kB
          Nhat Nguyen

              People

              • Assignee: Unassigned
              • Reporter: Nhat Nguyen (dnhatn)
              • Votes: 0
              • Watchers: 3

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 10m