Lucene - Core
LUCENE-4547

DocValues field broken on large indexes

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2, Trunk
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      I tried to write a test to sanity check LUCENE-4536 (first running against svn revision 1406416, before the change).

      But I found that docvalues is already broken for large indexes that have a PackedLongDocValues field:

      final int numDocs = 500000000;
      for (int i = 0; i < numDocs; ++i) {
        if (i == 0) {
          field.setLongValue(0L); // force > 32bit deltas
        } else {
          field.setLongValue(1<<33L); 
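          // NOTE: 1<<33L is an int shift in Java (the L suffix is on the shift
          // amount, not the operand), so this actually stores 2, not 2^33; see
          // the "out-of-coffee bug" comment below.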
        }
        w.addDocument(doc);
      }
      w.forceMerge(1);
      w.close();
      dir.close(); // checkindex
      
      [junit4:junit4]   2> WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #0,6,TGRP-Test2GBDocValues]
      [junit4:junit4]   2> org.apache.lucene.index.MergePolicy$MergeException: java.lang.ArrayIndexOutOfBoundsException: -65536
      [junit4:junit4]   2> 	at __randomizedtesting.SeedInfo.seed([5DC54DB14FA5979]:0)
      [junit4:junit4]   2> 	at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:535)
      [junit4:junit4]   2> 	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:508)
      [junit4:junit4]   2> Caused by: java.lang.ArrayIndexOutOfBoundsException: -65536
      [junit4:junit4]   2> 	at org.apache.lucene.util.ByteBlockPool.deref(ByteBlockPool.java:305)
      [junit4:junit4]   2> 	at org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$FixedBytesWriterBase.set(FixedStraightBytesImpl.java:115)
      [junit4:junit4]   2> 	at org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.writePackedInts(PackedIntValues.java:109)
      [junit4:junit4]   2> 	at org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.finish(PackedIntValues.java:80)
      [junit4:junit4]   2> 	at org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:130)
      [junit4:junit4]   2> 	at org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:65)
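
      For context: a negative index like this is consistent with 32-bit offset arithmetic overflowing once the pool holds more than 2^31 bytes. A minimal sketch of the arithmetic (the exact expression inside ByteBlockPool is an assumption here, not a quote from its code):

      // 500M docs times 8 bytes per value is ~4 GB, past Integer.MAX_VALUE,
      // so an int-typed byte offset wraps negative and deref() ends up
      // indexing backwards into its block array.
      class OverflowSketch {
        public static void main(String[] args) {
          long totalBytes = 500000000L * 8; // 4,000,000,000
          int offset = (int) totalBytes;    // wraps to -294,967,296
          System.out.println(offset);       // prints a negative, bogus offset
        }
      }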
      
      Attachments

      1. test.patch (3 kB) - Robert Muir
      2. LUCENE-4547.patch (1.86 MB) - Robert Muir

          Activity

          Robert Muir added a comment -

          Here was my initial test, just screwing around.

          I ran with 'ant test -Dtestcase=Test2GBDocValues -Dtests.nightly=true -Dtests.heapsize=5G'

          Robert Muir added a comment -

          There is even an out-of-coffee bug in the test: it's only using about 2 bits per value.
          So this is really even worse.

          I'm not sure we should be using ByteBlockPool etc. here. I think it shouldn't be used outside of the indexer.

          Robert Muir added a comment -

          Editing description: I think it actually affects more than just PackedIntValues.

          I think the bug is in how FixedStraightBytesImpl uses ByteBlockPool.

          So the problem should be way more widespread: e.g. if you have lots of documents in general, I think you are fucked (norms should trip it too).

          Robert Muir added a comment -

          Another bug is that I had to pass tests.heapsize at all.

          I think it's bad that docvalues gobbles up so much RAM when merging.
          Can't we merge this stuff from disk?

          Adrien Grand added a comment -

          I'm having a look at the branch, and it looks great! I like the fact that there are fewer types, and that values are buffered into memory so that the doc values format can make decisions depending on the number of distinct values, etc. Still, I have some questions about what you plan to do with this branch:

          • Do you plan to use this branch to:
            • fix other issues such as LUCENE-3862?
            • merge the FieldCache / FunctionValues / DocValues.Source APIs?
          • Are you going to remove DocValues.Type.FLOAT_*?
          • Are SimpleDVConsumer and SimpleDocValuesFormat going to replace PerDocConsumer and DocValuesFormat?
          • Are you going to remove hasArray/getArray?
          • Will there still be a direct=true|false option at load time, or will it depend on the format impl (potentially with a PerFieldPerDocProducer, similarly to the postings formats)?
          Robert Muir added a comment -

          These are hard questions. My personal goals for this prototype (currently SimpleText only!) were to:

          1. Make merging use (significantly) less RAM, to fix this bug.
          2. Make it easier to write docvalues codecs, to encourage innovation (e.g. FST impls, etc.).
          3. Simplify the types to make it easier on the user.

          The consumer API is simpler, I think (part of #2), but I would like to simplify the producer API too in the future.
          I'm not sure we should do it here, though. Anyway, we can think about the issues you raised one by one and handle them separately on their own issues.

          fix other issues such as LUCENE-3862?

          It's my opinion we should do this sooner rather than later.

          merge the FieldCache / FunctionValues / DocValues.Source APIs?

          This really needs to be addressed, but I think not here. It's horrific that algorithms like grouping, sorting, and maybe faceting have to be duplicated for two different things (FieldCache and docvalues).

          are you going to remove DocValues.Type.FLOAT_*?

          I think the 3 types we have here are enough. Someone can do a float or double type "on top of" the "number" type we have.
          Lucene is already doing this today: look at norms. I think Lucene should just have a number type that stores bits.
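
          For instance, the float-on-top-of-number mapping can be just the JDK's bit-conversion calls (a minimal illustration, not code from the branch):

          // at index time, store the raw IEEE-754 bits in the long "number" field
          long bits = Double.doubleToLongBits(3.14);
          // at read time, reinterpret the stored bits
          double value = Double.longBitsToDouble(bits);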

          are SimpleDVConsumer and SimpleDocValuesFormat going to replace PerDocConsumer and DocValuesFormat?

          That is the idea: once we are happy with the APIs, we would implement the 4.0 ones with them.

          are you going to remove hasArray/getArray?

          I don't care about this. I am unsure similarity impls should be calling it, though; at the very least
          it would be better for them to fall back. I just can't bring myself to fix it until LUCENE-3862 is fixed.

          will there still be a direct=true|false option at load time or will it depend on the format impl (potentially with a PerFieldPerDocProducer, similarly to the postings formats)?

          I don't want to change this in the branch. Personally, I feel that a codec/SegmentReader/etc. should generally only manage
          the direct case, with the producer exposing the same "stats" (minimum, maximum, fixed, whatever) that the consumer APIs get (which will also make merging more efficient!). The default Source impl can be something nice: read the direct impl into packed ints,
          and so on. A codec could override that to, e.g., just slurp its on-disk packed ints in directly. So the codec still has control
          of the in-memory RAM representation, and I think this is important. But I think the codec and SegmentReader should somehow not
          be in control of caching: that should live elsewhere (FieldCache.DOCVALUES.xxx?)...

          Adrien Grand added a comment -

          About the current SimpleDVConsumer API: NumericDocValuesConsumer addNumericField(FieldInfo field, long minValue, long maxValue) only allows for compression based on the width of the range, but there are many other ways to compress values (maybe there are very few unique values, maybe the last 3 bits are always the same, etc.). Now that we buffer values into memory, what would you think of changing the API to pass an iterable of longs instead, so that the doc values format can make better decisions?
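
          To make the idea concrete, here is a hedged sketch of what a format could do with such an iterable. All the names here (StatsSketchingConsumer, writeTable, writeDeltas) are made up for illustration; only the two-pass pattern is the point:

          import java.io.IOException;
          import java.util.HashSet;
          import java.util.Set;

          abstract class StatsSketchingConsumer {
            void addNumericField(Iterable<Long> values) throws IOException {
              long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
              Set<Long> distinct = new HashSet<Long>();
              for (long v : values) {            // pass 1: gather stats
                min = Math.min(min, v);
                max = Math.max(max, v);
                if (distinct.size() <= 256) {    // stop tracking once a table is pointless
                  distinct.add(v);
                }
              }
              if (distinct.size() <= 256) {
                writeTable(distinct, values);    // pass 2a: tiny dictionary + small ords
              } else {
                writeDeltas(min, max, values);   // pass 2b: bit-packed deltas from min
              }
            }

            abstract void writeTable(Set<Long> table, Iterable<Long> values) throws IOException;

            abstract void writeDeltas(long min, long max, Iterable<Long> values) throws IOException;
          }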

          Michael McCandless added a comment -

          I like that idea! So codec could iterate once, gathering whatever stats it needs, and then iterate again to do the writing. Should we not include the long min/maxValues then...?

          Adrien Grand added a comment -

          So codec could iterate once, gathering whatever stats it needs, and then iterate again to do the writing.

          Yep.

          Should we not include the long min/maxValues then...?

          I think so? And we could do something similar with addBinaryField and addSortedField. I think there are many possible optimizations based on how much lengths vary, whether BytesRefs share prefixes or not, etc.

          Simon Willnauer added a comment -

          FYI - I just added simple numeric & binary impls for Lucene41; the test demo passes for Lucene41 except for the sorted test (no impl yet).

          Robert Muir added a comment -

          What would the flush/merge API look like?
          Would it get simpler or more complicated?
          Could we still require certain stats from the producer, so that we can have a default, efficient in-RAM Source impl?

          I think there are many possible optimizations based on how much lengths vary, whether BytesRefs share prefixes or not, etc.

          Maybe, but arguably we should do the simplest possible thing that can work given the codecs we have today. When
          designing these APIs, to me these are the only ones that exist...

          Adrien Grand added a comment -

          The API could look like:

          abstract class DocValuesConsumer {

            // add all values in a single call
            abstract void addNumericField(FieldInfo field, Collection<Long> values);     // values.size() == numDocs
            abstract void addBinaryField(FieldInfo field, Collection<BytesRef> values);  // values.size() == numDocs
            abstract void addSortedField(FieldInfo field, Collection<BytesRef> ordToValue,
                                         Collection<Long> docToOrd);                     // docToOrd.size() == numDocs, ordToValue.size() == valueCount

            // same merge API

          }

          I don't see why merging would be more complicated, but maybe I'm missing something? The default merge impls would need to use lazy collection impls in order to remain memory-efficient.
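
          (A hedged sketch of such a lazy view, with hypothetical names; a real merge impl would also have to remap around deleted docs.)

          import java.util.AbstractCollection;
          import java.util.Collection;
          import java.util.Iterator;

          // Concatenates two per-segment value views without materializing them.
          final class ConcatValues extends AbstractCollection<Long> {
            private final Collection<Long> a, b;

            ConcatValues(Collection<Long> a, Collection<Long> b) { this.a = a; this.b = b; }

            @Override public int size() { return a.size() + b.size(); }

            @Override public Iterator<Long> iterator() {
              final Iterator<Long> ia = a.iterator(), ib = b.iterator();
              return new Iterator<Long>() {
                public boolean hasNext() { return ia.hasNext() || ib.hasNext(); }
                public Long next() { return ia.hasNext() ? ia.next() : ib.next(); }
                public void remove() { throw new UnsupportedOperationException(); }
              };
            }
          }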

          Could we still require certain stats from the Producer, so that we can have a default, efficient in-RAM Source impl?

          Why would we need stats to make a Source impl efficient?

          maybe, but arguably we should do the simplest possible thing that can work given the codecs we have today

          To me it looks as simple as the current API.

          Robert Muir added a comment -

          1. We don't need to be passing numDocs today: it should be removed because it duplicates SegmentWriteState.

          2. Merging becomes more complex because the API requires this collection view: today it does not.

          3. We need stats to make the Source impl efficient so we can e.g. use packed integers for the number type. Bye bye arrays.

          I don't think we should make anything more flexible than necessary until things like the producer API are sorted out.
          Otherwise it's difficult to see the tradeoffs.

          Adrien Grand added a comment -

          I don't think we should make anything more flexible than necessary until things like the producer api are sorted out.

          I can wait for the producer API. I just wanted to point out that by exposing all the values at once, the DocValuesFormat could make more clever choices than by just exposing the width of the range of values.

          Michael McCandless added a comment -

          I committed an initial attempt at a simpler producer API ... it's rough!! But at least TestDemoDocValue passes w/ SimpleText.

          I moved SimpleText's "loaded into RAM" wrappers up into SimpleDVProducer; this way a codec only has to impl the direct source, and can impl an in-RAM source if it wants to.

          SegmentCoreReaders now does the caching of in-RAM sources.

          Simon, I temporarily disabled the Lucene41DV producer ... I'll go get it working too ...

          Simon Willnauer added a comment -

          Hey folks,

          I looked at the branch, and I want to suggest we move a little slower here; we are doing too many things at once. For example, I really don't like the trend of making FieldCache the single source for caching. FieldCache has many problems in my opinion: it uses the DEFAULT singleton, and it has a single way of caching things per reader, while some users might want different access to DV; in ES we don't use FieldCache at all, for many reasons. I think we are going in the right direction here, and I do see why we should merge the interfaces and expose un-inverted fields via the new DV interface - nice! But exposing everything through FC is a no-go IMO; hiding it behind FC is no good.
          I also don't like the way "in-RAM" DV are exposed. I don't think we should have newRAMInstance() on the interface; let's keep the interface clean and not mix in how the values are represented. I'd rather vote for dedicated producers, or SimpleDocValuesProducer#getNumericDocValues(boolean inMemory); then we can still do caching on top. The producer should really be simple and shouldn't do caching. We could also separate the default in-memory impls into a simple helper class with methods like static NumericDocValues load(NumericDocValues directDocValues).
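
          A minimal sketch of that helper, under stated assumptions: the NumericDocValues interface is stubbed here (the branch API was still in flux), values are assumed non-negative, and only PackedInts is real Lucene API:

          import org.apache.lucene.util.packed.PackedInts;

          final class DocValuesLoader {
            // stand-in for whatever per-document accessor the branch settles on
            interface NumericDocValues { long get(int docID); }

            static NumericDocValues load(NumericDocValues direct, int maxDoc) {
              long max = 0;
              for (int i = 0; i < maxDoc; i++) {   // pass 1: find the widest value
                max = Math.max(max, direct.get(i));
              }
              final PackedInts.Mutable packed = PackedInts.getMutable(
                  maxDoc, PackedInts.bitsRequired(max), PackedInts.COMPACT);
              for (int i = 0; i < maxDoc; i++) {   // pass 2: copy into RAM
                packed.set(i, direct.get(i));
              }
              return new NumericDocValues() {
                public long get(int docID) { return packed.get(docID); }
              };
            }
          }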

          Robert Muir added a comment -

          If you don't like FieldCache in ES, then use something else.

          It's really broken that things like grouping and faceting are coded to a separate API. This makes DV unsuccessful. If we aren't going to fix the DocValues and faceting APIs, then perhaps we should consider removing DocValues completely from Lucene.

          Please try to give the branch some time. I committed some work very late last night, got tired, and went to sleep. It's not "ready" and I'm not threatening to commit to trunk.

          I created this branch to develop publicly. I don't have to do that: I can develop privately instead and not deal with complaints about my every step before I even say things are ready. I just thought it would make it easier for other people to help.

          Simon Willnauer added a comment -

          It's really broken that things like grouping and faceting are coded to a separate API. This makes DV unsuccessful.

          Don't get me wrong, I think this adds a lot of value. BUT if the only way to get a cached DV instance is FC, then this entire thing is inconsistent. We don't ask people to cache a TermsEnum, which can be heavy if you use MemoryPostings, for instance. Neither do we for stored fields, nor do we have a caching layer that is not used by default. All I am arguing for is that if somebody wants to use DV, it should be simple to do so. The distinction between in-memory and on-disk should not live in FC.

          If we aren't going to fix the DocValues and faceting APIs, then perhaps we should consider removing DocValues completely from Lucene.

          You mean remove FieldCache?

          Simon Willnauer added a comment -

          Just an idea, though... while we are on it, should we maybe add a 4th type that allows multiple values? That way we can just pull DV from any field and uninvert if needed.

          Shai Erera added a comment -

          while we are on it should we maybe add a 4th type that allows multiple values

          +1. That might allow using DVs for faceted search.

          Robert Muir added a comment -

          I don't think this is necessary. Someone can do this "on top" of a binary impl themselves.

          Simon Willnauer added a comment -

          Hey folks,

          I thought about the in-RAM vs. on-disk distinction we have in DV at this point and how we distinguish it API-wise. IMO, calling $TypeDocValues#newRAMInstance() with different behavior when the instance is already a RAM instance is kind of ugly API-wise, and a binary distinction here might not be sufficient anyway. From my point of view, it would logically make the most sense to let the codec decide whether values live in RAM, or whether only part of them are in memory, like in the sorted case where you might want to use an FST holding a subset of the values. Now, giving the control entirely to the codec might not be practical: think about merging, where you really don't want to load into memory; you should be able to say "don't pull into memory". We can already do this today if we pass in IOContext. Yet IOContext is the wrong level, since it is a reader-wide setting and might not hold for all fields when we open a reader for handling searches. Still, the idea of IOContext is basically to pass information about the access pattern, where "merge" means sequential access. We might want something similar for docvalues that leaves most of the decisions to the codec, but if a user decides he really needs stuff in memory, he can still pass in something like AccessPattern.SEQUENTIAL and load the values into an auxiliary data structure himself. This would let the codec optimize under the hood while making no promises about RAM vs. disk when AccessPattern.DEFAULT is passed.
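
          A hypothetical sketch of that hint (none of these names exist in Lucene; NumericDocValues is stubbed for illustration):

          // The caller states its access intent; the producer decides representation.
          enum AccessPattern { DEFAULT, SEQUENTIAL /* e.g. merging */, RANDOM /* e.g. sorting */ }

          interface NumericDocValues { long get(int docID); }

          interface HintedDocValuesProducer {
            // With DEFAULT the codec promises nothing about RAM vs. disk; with
            // SEQUENTIAL it can stream straight off disk and skip any caching.
            NumericDocValues getNumeric(String field, AccessPattern hint);
          }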

          Michael McCandless added a comment -

          I think the multi-valued case could be compelling, but we should probably do it later / outside this branch. E.g. FieldCache already supports this (DocTermOrds). It's true that an app could do this on top of binary DV, but I think it's useful enough that a real impl would be worthwhile (e.g. for facets).

          Michael McCandless added a comment -

          I think letting the codec control in-RAM vs on-disk is a great idea!

          Why not let merging load values into RAM if your DVFormat is a RAM-backed impl? The codec can always override merging if it wants to ...

          Simon Willnauer added a comment -

          I think letting the codec control in-RAM vs on-disk is a great idea!

          Actually, that is not what I was saying, and I strongly discourage requiring people to make RAM vs. on-disk decisions ahead of time. Most of those decisions need to be made dynamically, based on RAM availability and growth.

          What I was saying is that the user should communicate its intent so the codec can optimize.

          Michael McCandless added a comment -

          I think letting the codec control in-RAM vs on-disk is a great idea!

          actually that is not what I was saying and I strongly discourage that we require people to make ram vs. on disk decisions ahead of time.

          I think this is actually a clean way to do it, and it matches what we
          do with other codec parts. E.g. with postings you pick MemoryPF if you
          have the free RAM and want fast lookups for that field; otherwise you pick
          an on-disk postings format.

          Most of those decisions need to be made dynamically based on ram availability and growth.

          I think making dynamic decisions based on RAM availability and growth
          is a more expert use case; e.g. in Lucene today we don't give you that:
          deleted docs, norms, field cache entries, doc values (if you sort by
          them), and the terms index are all loaded into RAM. So the only control users
          have now is which fields they index/sort on...

          If we give control to the codec over whether the DV format is in RAM
          or on disk or something in between (like the terms index), and we make
          a PerFieldDVFormat so you can easily switch impls by field, then users
          can make the decisions themselves, field by field.

          If a given field will be used for sorting or faceting, they can use
          the fast RAM-based format, but if they are tight on RAM and have lots
          of scoring factors, maybe they use the disk-based impl for those fields.

          If an expert app really needs to pick and choose RAM vs. disk dynamically,
          depending on how many other indices are open, how much RAM they are
          using, etc., it can always make a custom DV format ...
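
          For reference, a hedged sketch of that per-field hook as it eventually shipped in the 4.2 codec; the field name and the "Disk" format choice are illustrative, and format names should be checked against what your version actually registers:

          import org.apache.lucene.codecs.Codec;
          import org.apache.lucene.codecs.DocValuesFormat;
          import org.apache.lucene.codecs.lucene42.Lucene42Codec;

          final class PerFieldDVExample {
            static Codec perFieldCodec() {
              return new Lucene42Codec() {
                @Override
                public DocValuesFormat getDocValuesFormatForField(String field) {
                  // hypothetical: keep a rarely-read scoring factor on disk,
                  // and use the codec default for everything else
                  return "rarely_used_factor".equals(field)
                      ? DocValuesFormat.forName("Disk")
                      : super.getDocValuesFormatForField(field);
                }
              };
            }
          }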

          Simon Willnauer added a comment -

          If an expert app really needs to pick and choose RAM vs. disk dynamically, depending on how many other indices are open, how much RAM they are using, etc., it can always make a custom DV format ...

          What I am worried about is the lack of communication between the app and the codec; something like this is going to be a major hassle. All I am asking for is to pass "hints" to the codec about what I need at a certain point, per field. We can't do this today, and I don't think we should accept that. It's an encoding/decoding layer and it should be simple. Pushing what you call "experts" to write their own codecs is a major trap, I think: writing a codec is a last resort and causes major trouble for non-Lucene devs IMO. This is expert^expert.

          I really like the idea of per-field DV and I think we should do it. I am just not a big fan of making up-front decisions for this stuff when it comes to on-disk vs. RAM. PostingsFormat is a different story: the on-disk (low RAM usage) formats have such good performance characteristics that you very unlikely need something else that uses lots of RAM. For sorting, grouping, or scoring you will certainly need that.

          Simon Willnauer added a comment -

          One way of merging the two approaches would be a simple boolean that forces on-disk access. That way we can solve the following problems:

          • default merge impls: a format doesn't need to override merge if it is an in-RAM-by-default impl
          • people can build auxiliary structures on top without worrying about RAM
          • decisions about which in-memory or on-disk structure is used by default stay with the codec

          This way we can keep the API clean and still support expert users. We could even make this package-private so that "normal" users won't see it?

          Robert Muir added a comment -

          This is sounding too complicated.
          I think it sounds OK to remove the RAM/on-disk distinction and just have the codec decide.
          If someone wants to do expert stuff in their codec that's fine, but we don't need it in our impls or abstract APIs.

          Let's start with just getting the basics working here and keeping the integration with the rest of Lucene simple.

          Today (in trunk) on-disk access is a pipe dream (thus, this issue) because the codec API is responsible for too much.

          Expert users and flexibility should be our last priority.

          Simon Willnauer added a comment - edited

          This is sounding too complicated.

          A single boolean is too complicated? All I ask for is a way to prevent loading into RAM when it's not necessary. We had this in 4.0 and I think we should make it work in 4.1 too. Remember, this is a different use-case than postings. I really don't think I'm asking for much here.

          Robert Muir added a comment -

          a single boolean is too complicated?

          I think it is; I feel like it really confuses the API and makes writing codecs harder.

          I think it would be better if the codec impl determined this, just like MemoryPostings and so on.
          So I'd rather have a per-field DV wrapper that configures this.

          For example, someone would use a different implementation for their Solr _version_ field than they
          would use for a scoring factor, and maybe a different implementation for a sort field than a faceting one.

          I don't think there is a use case for accessing a single field's values both from RAM and on disk,
          with the codec having to deal with that. It makes things very complicated currently.

          We had this in 4.0 and I think we should make it work in 4.1 too.

          I don't think that's necessarily true. In 4.0 the one DV impl we had could do a lot, but the codec API is
          very difficult. I actually contributed to a lot of the codec APIs in Lucene, and as a committer I was unable
          to figure out how to write a working DV impl against this API. I think this says a lot.

          I'd rather have a simpler codec API that enables innovation, so that we can see cool shit in the future,
          like implementations geared at sorting and faceting that use less RAM, and so on.

          If someone really needs more fine-grained control than the per-field codec API, then there are other ways to achieve
          that: FileSwitchDirectory, adding such APIs to their own codec, etc. But I'm not sure it's mainstream enough that it should
          be required of all codecs.

          David Smiley added a comment -

          I just got an error at search time fetching DocValues in 4.0.0:

          SEVERE: null:java.lang.ArrayIndexOutOfBoundsException
                  at org.apache.lucene.util.PagedBytes$Reader.fillSlice(PagedBytes.java:97)
                  at org.apache.lucene.codecs.lucene40.values.VarStraightBytesImpl$VarStraightSource.getBytes(VarStraightBytesImpl.java:273)
          

          I pass in a scratch BytesRef, and give a docId local to the segment.

          Could it be related to this issue? This bug is freaking me out a bit as I may be forced to abandon DocValues.

          Robert Muir added a comment -

          David: I'm not sure; the problem I opened this issue for actually relates to things like merging.

          Can you do a manual bounds check? Docvalues at read time doesn't have explicit checks (an IndexOutOfBoundsException is expected/best-effort if you pass a wrong docid):

              if (docID < 0 || docID >= reader.maxDoc()) {
                // out of bounds: the docID passed in is wrong
              }

          Otherwise, if that's not the problem, do you have a test or something you could upload to an issue?

          Michael McCandless added a comment -

          Just a quick recap on where things stand on the branch:

          • We have the DV 2.0 API, shadowing the DV 1.0 API.
          • We have one codec (SimpleText) that implements it and passes tests.
          • CheckIndex does basic tests of DV 2.0, and we also have
            TestDemoDocValue, but nothing else is cut over yet.
          • The Lucene41 codec's impl is close, I think, but was failing some tests
            (not sure why yet).
          • We have a MemoryDV but it's very RAM-inefficient for now.
          • We have a Norms 2.0 API too, shadowing the current norms; only
            SimpleText implements it (but it should be easy to get Lucene41 to
            impl it too).
          • We need to cut over all uses/tests of DV 1.0 / norms 1.0 and then
            remove the DV/norms 1.0 shadow code.
          • There are still tons and tons of nocommits ...
          Robert Muir added a comment -

          Patch that applies to trunk r1442822.

          I think this is close: jenkins-blasted, benchmarked, beefed-up existing tests and added many new ones, several codecs with different tradeoffs, per-field configuration, 4.0 file-format compatibility, a more efficient 4.2 format, and so on.

          Adrien Grand added a comment -

          Big +1!

          I especially like the fact that doc values

          • now use the same interface as FieldCache,
          • can be configured on a per-field basis thanks to a SPI,
          • are first buffered in memory (with RAM accounting) so that the codec has more opportunities to perform optimizations.
          Commit Tag Bot added a comment -

          [trunk commit] Robert Muir
          http://svn.apache.org/viewvc?view=revision&revision=1443717

          LUCENE-4547: DocValues improvements

          Commit Tag Bot added a comment -

          [branch_4x commit] Robert Muir
          http://svn.apache.org/viewvc?view=revision&revision=1443834

          LUCENE-4547: DocValues improvements

          Markus Jelsma added a comment -

          Revision 1443717 breaks custom collector code such as:

          int[] ints = FieldCache.DEFAULT.getInts(context.reader(), this.fieldName, false);
          

          Do you suggest we keep the returned Ints instance? But where is the concrete Ints class? I can't seem to find it.

          Thanks

          Adrien Grand added a comment -

          where is the concrete Ints class?

          This class is an inner class of FieldCache: oal.search.FieldCache.Ints. You should then be able to fix your code by replacing ints[i] with ints.get(i).
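
          A minimal before/after sketch of the cutover (the collector context and field name are illustrative; only the getInts call and Ints.get match the new API):

          import java.io.IOException;
          import org.apache.lucene.index.AtomicReaderContext;
          import org.apache.lucene.search.FieldCache;

          final class IntsCutover {
            // before: int[] ints = FieldCache.DEFAULT.getInts(...); ... ints[docID]
            // after:
            static int valueFor(AtomicReaderContext context, String field, int docID) throws IOException {
              FieldCache.Ints ints = FieldCache.DEFAULT.getInts(context.reader(), field, false);
              return ints.get(docID);
            }
          }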

          Markus Jelsma added a comment -

          Oh, I see: FieldCacheImpl returns an Ints whose get(int docID) is overridden.

          Uwe Schindler added a comment -

          Closed after release.


            People

            • Assignee: Unassigned
            • Reporter: Robert Muir
            • Votes: 0
            • Watchers: 9