If you look at the current logic, it tries to optimize this, but it doesn't have any safety switch. It also doesn't always work: it sometimes falls back to the default algorithm even when there are no deletions. This is because of this conditional logic in trunk (comments mine):
if (docBase + chunkDocs < maxDoc) ...
When there are deletions it will always fall back and never resync, so handling that case currently doesn't achieve anything except complexity.
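To make the failure mode concrete, here is a rough sketch of that flow (the helper names like copyCompressedChunk/mergeOneDocSlowly are made up for illustration; this is not the actual trunk code): once the merge takes the slow path for a single document, docBase no longer lands on a chunk boundary, and nothing ever re-aligns it, so the bulk path is gone for the rest of the merge.

// Illustrative sketch only; the helpers are hypothetical, not Lucene APIs.
int docBase = 0;
boolean aligned = true;                       // is docBase on a chunk boundary?
while (docBase < maxDoc) {
  int chunkDocs = aligned ? docsInChunkAt(docBase) : -1;
  if (aligned && liveDocs == null && docBase + chunkDocs < maxDoc) {
    copyCompressedChunk(docBase, chunkDocs);  // fast path: raw bytes, no decompression
    docBase += chunkDocs;                     // lands on the next chunk boundary
  } else {
    mergeOneDocSlowly(docBase++);             // default path: decompress, re-add
    aligned = false;                          // docBase is now mid-chunk; there is no
                                              // resync logic, so we stay slow forever
  }
}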
The falling out of sync hurts documents with small vectors the most, because the default algorithm is slow (no getMergeInstance). For big documents (size > chunkSize) you also dodge the resync problem above.
I replaced the logic with the same algorithm that stored fields uses.
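Roughly sketched below (again with hypothetical names; this is not the actual patch), the stored-fields approach differs in one key way: the slow path always consumes the rest of the current chunk, so the next iteration starts on a chunk boundary again and can resume bulk copying. Deletions only knock it off the fast path one chunk at a time instead of permanently.

// Illustrative sketch of the stored-fields-style merge loop; Chunk and the
// abstract helpers are stand-ins, not Lucene APIs.
abstract class ChunkMergeSketch {
  /** A compressed chunk covering docs [firstDoc, firstDoc + numDocs). */
  record Chunk(int firstDoc, int numDocs) {}

  abstract Chunk chunkContaining(int doc);
  abstract boolean hasDeletions(Chunk c);
  abstract boolean isDeleted(int doc);
  abstract void copyCompressedChunk(Chunk c);   // fast path: raw byte copy
  abstract void mergeOneDocSlowly(int doc);     // slow path: decompress, re-add

  void merge(int maxDoc) {
    int doc = 0;
    while (doc < maxDoc) {
      Chunk chunk = chunkContaining(doc);
      boolean aligned  = doc == chunk.firstDoc();
      boolean complete = chunk.firstDoc() + chunk.numDocs() <= maxDoc;
      if (aligned && complete && !hasDeletions(chunk)) {
        copyCompressedChunk(chunk);             // no decompression at all
      } else {
        // Slow path, but finish the WHOLE chunk so the next iteration
        // starts on a chunk boundary again -- the resync.
        for (int d = doc; d < chunk.firstDoc() + chunk.numDocs(); d++) {
          if (!isDeleted(d)) {
            mergeOneDocSlowly(d);
          }
        }
      }
      doc = chunk.firstDoc() + chunk.numDocs();
    }
  }
}

Because the slow path is also what handles deletions, there is no separate deletion special case to maintain, which is where the simplicity win comes from.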
I turned on term vectors (5 fields), indexing, and stored fields, with small documents (10 fields):
trunk:
SM 0 [2015-01-17 03:26:30.208; main]: 3765 msec to merge vectors [9490360 docs]
SM 0 [2015-01-17 03:26:04.928; main]: 5508 msec to merge vectors [7300730 docs]
SM 0 [2015-01-17 03:25:43.179; main]: 1261 msec to merge vectors [189430 docs]

patch:
SM 0 [2015-01-17 03:37:15.480; main]: 2183 msec to merge vectors [9490360 docs]
SM 0 [2015-01-17 03:36:50.698; main]: 1492 msec to merge vectors [7300730 docs]
SM 0 [2015-01-17 03:36:32.620; main]: 27 msec to merge vectors [189430 docs]
You can see the forceMerge time is not that much better (3765 msec -> 2183 msec), because 2/3 of the collection is in the first segment, but overall indexing improves because the change speeds up the other merges too (see the 189430-doc one: 1261 msec -> 27 msec).
Anyway, I think it's better overall, if only for the simplicity and the additional safety mechanism.