Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
I phrased this as a question since it's mainly a discussion. I have spoken to rcmuir on a couple of occasions about making index sorting work with soft deletes. What prevents this is that soft deletes use updatable doc values (DV) to mark docs as deleted. This means that a sorted segment is no longer guaranteed to be sorted once it has received any updates. It also means that re-sorting such a segment on merge has a significant overhead (I hope jimczi can shed some light on how much overhead we would have to expect). We would also need some special casing, since we use "merge sorting" and can't go backwards in doc ID, which would be violated if a segment had received updates. (cc jpountz)
The main purpose of doing this is that soft-deleted documents would end up either at the end or at the beginning of the segment, so that compression is better when these docs are kept around under longer retention policies.
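To make the problem above concrete, here is a minimal sketch in plain Java. It does not use any Lucene API. The class `SoftDeleteSortSketch`, its methods, and the idea of modelling a segment as a single `long[]` column holding a soft-delete flag (0 = live, 1 = soft deleted) are all assumptions made for illustration. It shows that an in-place update leaves a segment unsorted by that flag, that a forward-only merge of such a segment produces unsorted output, and that re-sorting the segment first (the extra cost discussed above) restores the invariant.

```java
import java.util.Arrays;

public class SoftDeleteSortSketch {

    // Hypothetical stand-in for a segment: one numeric column indexed by doc ID.
    // Returns true if the values are in non-decreasing doc ID order.
    static boolean isSorted(long[] column) {
        for (int i = 1; i < column.length; i++) {
            if (column[i - 1] > column[i]) {
                return false;
            }
        }
        return true;
    }

    // Models an updatable doc value: the value changes, the doc ID does not.
    static void updateInPlace(long[] column, int docId, long newValue) {
        column[docId] = newValue;
    }

    // Two-way merge that only ever moves forward in each input.
    // The output is sorted only if both inputs are sorted.
    static long[] forwardOnlyMerge(long[] a, long[] b) {
        long[] out = new long[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) {
            out[k++] = a[i++];
        }
        while (j < b.length) {
            out[k++] = b[j++];
        }
        return out;
    }

    public static void main(String[] args) {
        long[] segmentA = {0, 0, 0, 0}; // all live, trivially sorted by the flag
        long[] segmentB = {0, 0, 1};    // sorted: the soft-deleted doc is last

        updateInPlace(segmentA, 1, 1L); // soft delete doc 1 of segment A
        System.out.println("A sorted after update: " + isSorted(segmentA));

        long[] merged = forwardOnlyMerge(segmentA, segmentB);
        System.out.println("merged: " + Arrays.toString(merged)
            + " sorted: " + isSorted(merged));

        // Paying the re-sort cost before merging restores the invariant.
        long[] resortedA = segmentA.clone();
        Arrays.sort(resortedA);
        long[] fixed = forwardOnlyMerge(resortedA, segmentB);
        System.out.println("merged after re-sort: " + Arrays.toString(fixed)
            + " sorted: " + isSorted(fixed));
    }
}
```

In a real index the re-sort would have to remap every doc ID in the segment, not just one column, which is where the overhead mentioned above would come from.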