  1. Lucene - Core
  2. LUCENE-8255

Can we make index sorting work for soft deletes

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • New

    Description

      I phrased this as a question since it's mainly a discussion. I spoke to rcmuir on a couple of occasions about making index sorting work for soft deletes. What prevents this is that soft deletes use updatable doc values (DV) to mark docs as deleted, so a sorted segment is no longer guaranteed to be sorted once it has received any updates. It also means that re-sorting such a segment on merge has significant overhead (I hope jimczi can shed some light on how much we should expect). We would also need some special casing, since we use "merge sorting" and can't go backwards in doc ID, an assumption that is violated if a segment has received updates. (cc jpountz)
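
To make the problem concrete, here is a minimal sketch in plain Java. It does not use any Lucene API: the class `SoftDeleteSortSketch`, its methods, and the modeling of a segment as an array of soft-delete flags indexed by doc ID are all assumptions made for illustration. It shows that an in-place flag update leaves a doc where it is, so the segment stops honoring a "deleted docs last" sort, and that a forward-only merge over such an input then produces unsorted output.

```java
// Hypothetical model, not Lucene code: a "segment" is an array of per-document
// soft-delete flags (0 = live, 1 = soft-deleted), indexed by doc ID.
final class SoftDeleteSortSketch {

    // True if the flags are non-decreasing in doc ID order, i.e. the segment
    // still honors an index sort that places soft-deleted docs last.
    static boolean isSorted(int[] flags) {
        for (int i = 1; i < flags.length; i++) {
            if (flags[i - 1] > flags[i]) {
                return false;
            }
        }
        return true;
    }

    // Models a doc-values update: the flag changes, the doc ID does not.
    static void softDelete(int[] flags, int docId) {
        flags[docId] = 1;
    }

    // Two-way streaming merge that only moves forward in each input,
    // so its output is sorted only if both inputs are already sorted.
    static int[] forwardOnlyMerge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] segmentA = {0, 0, 0, 0};   // freshly flushed, all docs live
        int[] segmentB = {0, 0, 1};      // sorted, one soft-deleted doc last

        System.out.println(isSorted(segmentA));   // sorted before any update
        softDelete(segmentA, 1);                  // update arrives after flush
        System.out.println(isSorted(segmentA));   // doc 1 is now out of place
        System.out.println(isSorted(forwardOnlyMerge(segmentA, segmentB)));
    }
}
```

In this toy model the only fix is to re-sort the updated segment before merging, which is the extra merge cost raised above.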

      The main purpose of doing this is that "soft deleted" documents would sit either at the end or at the beginning of the segment, so that compression is better if these docs have longer retention policies.

          People

            Assignee: Unassigned
            Reporter: Simon Willnauer (simonw)
            Votes: 0
            Watchers: 4
