[LUCENE-2426] change sort order to binary order - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 3.1
Fix Version/s: 4.0-ALPHA
Component/s: core/index
Labels: None
Lucene Fields: New

Description

Since flexible indexing, terms are represented as byte[]. For backwards-compatibility reasons, however, they are not sorted as byte[] but as if they were char[].
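To make the difference between the two orders concrete, here is a minimal sketch. It assumes that terms are UTF-8 encoded and that "sorted as char[]" means UTF-16 code unit order; the class and method names are hypothetical and not part of Lucene. The two orders disagree only when a supplementary character (a surrogate pair in UTF-16) is compared against a character in the range U+E000 to U+FFFF.

```java
import java.nio.charset.StandardCharsets;

public class TermOrderDemo {
    // "char[]" order: String.compareTo compares UTF-16 code units.
    static int compareAsChars(String a, String b) {
        return a.compareTo(b);
    }

    // "byte[]" order: unsigned lexicographic comparison of the UTF-8 bytes.
    static int compareAsBytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Mask to treat each byte as unsigned (0..255).
            int d = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";         // U+FF5E, a single UTF-16 code unit
        String supp = "\uD800\uDC00";  // U+10000, encoded as a surrogate pair

        // In code unit order the surrogate 0xD800 sorts before 0xFF5E.
        System.out.println("char[] order: bmp > supp is " + (compareAsChars(bmp, supp) > 0));
        // In UTF-8 byte order the lead byte 0xEF sorts before 0xF0.
        System.out.println("byte[] order: bmp < supp is " + (compareAsBytes(bmp, supp) < 0));
    }
}
```

For terms that contain only characters below U+D800, both comparisons give the same ordering, which is why the discrepancy is easy to overlook.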

I think it's time to look at sorting terms as byte[]. This would yield the following improvements:

- Terms are more opaque by default: they are byte[] and sort as byte[]. I think this would make Lucene friendlier to customizations.
- Numerics and collation are then free to use their own encoding (full byte), rather than avoiding the use of certain bits to remain compatible with char[] sort order.
- Automaton gets simpler because, as in LUCENE-2265, it uses byte[] too, and it has special hacks because terms are sorted as char[].
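As an illustration of the second point, the sketch below shows one way a numeric encoding could use all eight bits of every byte once terms sort in binary order. This is an assumption made for illustration, not Lucene's actual numeric encoding: the class `SortableIntBytes` and its methods are hypothetical. The idea is to flip the sign bit and write the value big-endian, so that unsigned byte order matches signed integer order.

```java
public class SortableIntBytes {
    // Hypothetical encoding: flip the sign bit, then emit 4 big-endian bytes.
    // Flipping the sign bit maps Integer.MIN_VALUE to 0x00000000 and
    // Integer.MAX_VALUE to 0xFFFFFFFF, so negatives sort before positives.
    static byte[] encode(int value) {
        int u = value ^ 0x80000000;
        return new byte[] {
            (byte) (u >>> 24), (byte) (u >>> 16), (byte) (u >>> 8), (byte) u
        };
    }

    // Unsigned lexicographic comparison, i.e. binary term order.
    static int compareBinary(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        int[] values = {-5, 3, 255, 256};
        // Each adjacent pair is ascending numerically; check the encoded order agrees.
        for (int i = 0; i + 1 < values.length; i++) {
            boolean ordered = compareBinary(encode(values[i]), encode(values[i + 1])) < 0;
            System.out.println(values[i] + " < " + values[i + 1] + " in binary order: " + ordered);
        }
    }
}
```

An encoding like this only works if every byte value from 0x00 to 0xFF is allowed in a term and compared as unsigned, which is the freedom the description asks for.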

Attachments

- LUCENE-2426_automaton.patch, 18/Jun/10 16:05, 7 kB, Robert Muir
- LUCENE-2426.patch, 20/Jun/10 23:11, 71 kB, Michael McCandless
- LUCENE-2426.patch, 18/Jun/10 15:48, 47 kB, Michael McCandless

Issue Links

depends upon

- LUCENE-2380 Add FieldCache.getTermBytes, to load term data as byte[] (Closed)
- LUCENE-2442 Remove flex back compat layers & pre-flex APIs (Closed)
- LUCENE-2265 improve automaton performance by running on byte[] (Closed)

is blocked by

- LUCENE-2378 Cutover remaining usage of pre-flex APIs (Closed)

People

Assignee: Unassigned
Reporter: Robert Muir
Votes: 0
Watchers: 0

Dates

Created: 02/May/10 14:19
Updated: 28/Aug/22 12:25
Resolved: 24/Jun/10 13:36