[LUCENE-9096] Implementation of CompressingTermVectorsWriter.flushOffsets can be simpler - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 8.2
Fix Version/s: 8.5
Component/s: core/codecs
Labels: None
Lucene Fields: New

Description

In CompressingTermVectorsWriter.flushOffsets, sumPos and sumOffsets are computed as follows:

for (int i = 0; i < fd.numTerms; ++i) { 
  int previousPos = 0;
  int previousOff = 0;
  for (int j = 0; j < fd.freqs[i]; ++j) { 
    final int position = positionsBuf[fd.posStart + pos];
    final int startOffset = startOffsetsBuf[fd.offStart + pos];
    sumPos[fieldNumOff] += position - previousPos; 
    sumOffsets[fieldNumOff] += startOffset - previousOff; 
    previousPos = position;
    previousOff = startOffset;
    ++pos;
  }
}

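Each inner iteration adds position - previousPos, so for a term with five positions the sum expands to:

(position5-position4)+(position4-position3)+(position3-position2)+(position2-position1)+(position1-0)

Because previousPos starts at 0, this telescopes to position5, the last position of the term. The same applies to startOffset and sumOffsets, so the inner loop can be replaced by reading the last value for each term.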
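The telescoping argument can be checked with a small standalone sketch. It is not the patch attached to this issue: the class and method names (TelescopingSumSketch, sumByDeltas, sumByLast) are hypothetical, and flat arrays stand in for the writer's fd.freqs and positionsBuf fields. It assumes each term's values are stored contiguously, as in the loop quoted above.

```java
class TelescopingSumSketch {

  // Mirrors the quoted loop: per term, add the delta between consecutive values,
  // with the "previous" value reset to 0 at the start of each term.
  static long sumByDeltas(int[] freqs, int[] values) {
    long sum = 0;
    int pos = 0;
    for (int i = 0; i < freqs.length; ++i) {
      int previous = 0;
      for (int j = 0; j < freqs[i]; ++j) {
        final int value = values[pos];
        sum += value - previous;
        previous = value;
        ++pos;
      }
    }
    return sum;
  }

  // Simplified form: the deltas of one term telescope to that term's last value.
  static long sumByLast(int[] freqs, int[] values) {
    long sum = 0;
    int pos = 0;
    for (int i = 0; i < freqs.length; ++i) {
      if (freqs[i] > 0) { // guard for an empty term (an assumption, not from the issue)
        sum += values[pos + freqs[i] - 1];
      }
      pos += freqs[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    int[] freqs = {3, 2};              // two terms, with 3 and 2 occurrences
    int[] positions = {1, 4, 9, 2, 7}; // term 0: 1,4,9; term 1: 2,7
    System.out.println(sumByDeltas(freqs, positions)); // prints 16
    System.out.println(sumByLast(freqs, positions));   // prints 16
  }
}
```

Both methods return 9 + 7 = 16 for this input. The second needs one array read per term instead of one per occurrence.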

Issue Links

links to GitHub Pull Request #1125

People

Assignee: Unassigned

Reporter: kkewwei

Votes: 0

Watchers: 4

Dates

Created: 17/Dec/19 03:30

Updated: 28/Aug/22 15:54

Resolved: 06/Jan/20 08:23

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

40m