[LUCENE-6653] Cleanup TermToBytesRefAttribute - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 5.3, 6.0
Component/s: modules/analysis
Labels: None
Lucene Fields: New

Description

While working on ~~LUCENE-6652~~, I found that many tests contained incorrectly implemented TermToBytesRefAttribute classes. In addition, the whole concept dating back to Lucene 4.0 is no longer correct:

- We don't return the hash code anymore; it is calculated by BytesRefHash
- The interface is horrible to use. It tends to reuse the BytesRef instance but the whole thing is not correct.

Instead, we should remove the fillBytesRef() method from the interface and let getBytesRef() populate and return the BytesRef. It does not matter whether the attribute reuses the BytesRef or returns a new one. It just gets consumed like a standard CharTermAttribute: you get a BytesRef and can use it until you call incrementToken().
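The proposed contract can be illustrated with a minimal, self-contained sketch. The names `TermBytesSketch`, `Bytes`, `TermBytesAttribute`, `ToyStream`, and `consume` are hypothetical stand-ins invented for this illustration; they are not the Lucene classes and not the code in the attached patch. Only the shape of the contract (a single getter, with the returned value valid until the next incrementToken() call) is taken from the description above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Lucene classes.
public class TermBytesSketch {

    /** Minimal byte-slice holder, modelling the role that BytesRef plays. */
    static final class Bytes {
        byte[] data = new byte[0];
        int length;
    }

    /** Models the proposed contract: one getter, no separate fillBytesRef(). */
    interface TermBytesAttribute {
        Bytes getBytesRef();
    }

    /** Toy token stream that happens to reuse a single Bytes instance. */
    static final class ToyStream implements TermBytesAttribute {
        private final String[] tokens;
        private final Bytes reused = new Bytes();
        private int pos = -1;

        ToyStream(String... tokens) {
            this.tokens = tokens;
        }

        /** Advances to the next token; returns false when exhausted. */
        boolean incrementToken() {
            return ++pos < tokens.length;
        }

        /** Populates and returns the bytes of the current token. */
        @Override
        public Bytes getBytesRef() {
            byte[] utf8 = tokens[pos].getBytes(StandardCharsets.UTF_8);
            reused.data = utf8;
            reused.length = utf8.length;
            return reused;
        }
    }

    /** Consumer loop: fetch the bytes after each incrementToken() call. */
    static List<String> consume(ToyStream stream) {
        List<String> terms = new ArrayList<>();
        while (stream.incrementToken()) {
            Bytes bytes = stream.getBytesRef();
            // The instance may be reused, so decode (copy) it before advancing.
            terms.add(new String(bytes.data, 0, bytes.length, StandardCharsets.UTF_8));
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(consume(new ToyStream("foo", "bar")));
    }
}
```

The consumer never needs to know whether the attribute reuses its instance; it only has to finish with (or copy) the returned value before advancing the stream.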

As TermToBytesRefAttribute is marked experimental, I see no reason why we should not change the semantics to be easier to understand and to behave like all other attributes. I will add a note to the backwards-incompatible changes in Lucene 5.3.

Attachments

LUCENE-6653.patch
01/Jul/15 23:08
68 kB
Uwe Schindler

Issue Links

incorporates

LUCENE-6652 Remove tons of BytesRefAttribute/BytesRefAttributeImpl duplicates in tests

Closed

People

Assignee: Uwe Schindler

Reporter: Uwe Schindler

Votes: 0

Watchers: 3

Dates

Created: 01/Jul/15 21:27

Updated: 28/Aug/22 14:38

Resolved: 02/Jul/15 15:09