
Key: LUCENE-1262
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Unassigned
Reporter: Trejkaz
Votes: 0
Watchers: 0
Lucene - Java

IndexOutOfBoundsException from FieldsReader after problem reading the index

Created: 09/Apr/08 12:23 AM   Updated: 08/May/08 07:47 PM
Component/s: Index
Affects Version/s: 2.3.1
Fix Version/s: 2.3.2, 2.4

Time Tracking:
Not Specified

File Attachments:
LUCENE-1262.patch (text file, licensed for inclusion in ASF works), 6 kB, attached 2008-04-11 10:20 AM by Michael McCandless
Test.java (Java source file), 2 kB, attached 2008-04-10 12:31 AM by Trejkaz

Lucene Fields: New
Resolution Date: 13/Apr/08 11:27 PM


Description
In some situations, reading a document from Hits throws an IOException, and the next call then throws a NullPointerException instead of an IOException.

Example stack traces:

java.io.IOException: The specified network name is no longer available
at java.io.RandomAccessFile.readBytes(Native Method)
at java.io.RandomAccessFile.read(RandomAccessFile.java:322)
at org.apache.lucene.store.FSIndexInput.readInternal(FSDirectory.java:536)
at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:74)
at org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:220)
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:93)
at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:34)
at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:57)
at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:88)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:344)
at org.apache.lucene.index.IndexReader.document(IndexReader.java:368)
at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:84)
at org.apache.lucene.search.Hits.doc(Hits.java:104)

That error is fine. The problem is the next call to doc generates:

java.lang.NullPointerException
at org.apache.lucene.index.FieldsReader.getIndexType(FieldsReader.java:280)
at org.apache.lucene.index.FieldsReader.addField(FieldsReader.java:216)
at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:101)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:344)
at org.apache.lucene.index.IndexReader.document(IndexReader.java:368)
at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:84)
at org.apache.lucene.search.Hits.doc(Hits.java:104)

Presumably FieldsReader is caching partially-initialised data somewhere. I would normally expect the exact same IOException to be thrown for subsequent calls to the method.



Comments
Michael McCandless added a comment - 09/Apr/08 09:51 AM
Those stack traces look like 2.1 not 2.3.1. Is that right?

Can you post the index that you are using and the code that results in the 2nd exception? I can't get the 2nd exception to happen in a test case...


Trejkaz added a comment - 09/Apr/08 11:08 PM
Whoops. I don't think it's 2.1 but it must be 2.2.

I'll try to reproduce this standalone, but first I need a way to make readInternal throw an exception. I presume you were using some kind of custom store implementation to do that. I'll see if I can make it happen under 2.2 and then try the same thing under 2.3.1 to confirm whether it still breaks.


Trejkaz added a comment - 10/Apr/08 12:01 AM
Okay I'll eat my words now, it is indeed 2.1 as the version doesn't have openInput(String,int) in it.

Anyway an update: I've managed to reproduce it on any text index by simulating random network outage. I'm keeping a flag which I set to true. The trick is that the wrapping IndexInput implementation randomly throws IOException if the flag is true – if it always throws IOException the problem doesn't occur. If it randomly throws it then it occurs occasionally, and it always seems to be for larger queries (I'm using MatchAllDocsQuery now.)

I'll see if I can tweak the code to make it more likely to happen and then start working up to each version of Lucene to see if it stops happening somewhere.
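
For illustration, the flag-controlled random failure described above can be sketched as follows. This is only an analogue built on java.io.InputStream so that it runs without Lucene on the classpath; the attached Test.java wraps Lucene's IndexInput instead. The class name FlakyInputStream, the outage flag, the seed and the failureRate parameter are all hypothetical names chosen for this sketch.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Random;

/**
 * Hypothetical analogue of the wrapping input described in the comment:
 * while the outage flag is set, each read fails with some probability.
 * Built on java.io so it is self-contained; it is not the attached Test.java.
 */
final class FlakyInputStream extends FilterInputStream {
    private final Random random;
    private final double failureRate;

    /** Set to true to start simulating an unreliable network share. */
    volatile boolean outage = false;

    FlakyInputStream(InputStream in, long seed, double failureRate) {
        super(in);
        this.random = new Random(seed);
        this.failureRate = failureRate;
    }

    /** Throws before touching the underlying stream, so no bytes are consumed. */
    private void maybeFail() throws IOException {
        if (outage && random.nextDouble() < failureRate) {
            throw new IOException("simulated network outage");
        }
    }

    @Override
    public int read() throws IOException {
        maybeFail();
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        maybeFail();
        return super.read(b, off, len);
    }
}
```

A failureRate below 1.0 corresponds to the intermittent case that triggers the bug; per the observation above, a wrapper that always throws does not reproduce it.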


Trejkaz added a comment - 10/Apr/08 12:29 AM
I managed to reproduce the problem as-is under version 2.2.

For 2.3 the problem has changed – instead of a NullPointerException it is now an IndexOutOfBoundsException:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 52, Size: 34
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:154)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659)
at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:92)
at org.apache.lucene.search.Hits.doc(Hits.java:167)
at Test.main(Test.java:24)

Will attach my test program in a moment.


Trejkaz added a comment - 10/Apr/08 12:31 AM
Attaching a test program to reproduce the problem under 2.3.1.

It occurs in approximately 1 in every 4 executions for any reasonably large text index (really small ones don't seem to trigger it, so I couldn't attach a text index with it). The number of fields may be related: judging by the numbers in the IndexOutOfBoundsException, the indexes we have happen to have a large number of fields.


Michael McCandless added a comment - 11/Apr/08 09:21 AM
OK indeed I can get the failure to happen, using your Test running against a partial Wikipedia index I have. I'll pursue! Thanks Trejkaz.

Michael McCandless added a comment - 11/Apr/08 10:20 AM
Attached patch. All tests pass. I plan to commit in a day or so, to
both trunk (2.4) and 2.3.X branch (2.3.2).

I got the failure to happen with a standalone test case, added to
TestFieldsReader.

I found & fixed the issue. It's in BufferedIndexInput's refill()
method. The problem is that the method changes bufferLength even if an
exception is hit. This leaves incorrect bytes in the buffer such that
a subsequent readByte will return the incorrect bytes.

The fix is simple: use a local "int newLength" and only assign that
value to bufferLength if the readInternal() call succeeds. The test
fails without the fix and passes with it.
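
To make the idea concrete, here is a minimal, self-contained sketch of the pattern. It is not the Lucene source or the attached patch: RefillSketch, Source, FlakySource, BufferedInput and tryRead are hypothetical names, and the 4-byte buffer is chosen only to keep the example small. The point it illustrates is that refill() computes newLength locally and touches the buffer state only after the read has succeeded.

```java
import java.io.IOException;

/** Simplified stand-in for a buffered input; all names here are hypothetical. */
final class RefillSketch {

    /** A byte source whose read may fail, standing in for readInternal(). */
    interface Source {
        void read(long pos, byte[] dest, int len) throws IOException;
        long length();
    }

    /** In-memory source that throws while the failing flag is set. */
    static final class FlakySource implements Source {
        private final byte[] data;
        boolean failing = false;

        FlakySource(byte[] data) { this.data = data; }

        @Override
        public void read(long pos, byte[] dest, int len) throws IOException {
            if (failing) throw new IOException("simulated read failure");
            System.arraycopy(data, (int) pos, dest, 0, len);
        }

        @Override
        public long length() { return data.length; }
    }

    static final class BufferedInput {
        private final Source source;
        private final byte[] buffer = new byte[4];
        private long bufferStart = 0;    // position in the source of buffer[0]
        private int bufferLength = 0;    // number of valid bytes in buffer
        private int bufferPosition = 0;  // next byte to return from buffer

        BufferedInput(Source source) { this.source = source; }

        int readByte() throws IOException {
            if (bufferPosition >= bufferLength) refill();
            return buffer[bufferPosition++] & 0xFF;
        }

        private void refill() throws IOException {
            long start = bufferStart + bufferPosition;
            long end = Math.min(start + buffer.length, source.length());
            int newLength = (int) (end - start);  // local, not yet committed
            if (newLength <= 0) throw new IOException("read past EOF");
            source.read(start, buffer, newLength); // may throw
            // Commit state only after the read succeeded, so a failed
            // refill leaves the input exactly as it was.
            bufferLength = newLength;
            bufferStart = start;
            bufferPosition = 0;
        }
    }

    /** Returns the byte as a string, or "IOException" if the read failed. */
    static String tryRead(BufferedInput in) {
        try {
            return Integer.toString(in.readByte());
        } catch (IOException e) {
            return "IOException";
        }
    }
}
```

In this sketch, assigning bufferLength before the source.read call would make the next readByte after a failure skip the refill and return whatever stale bytes sit in the buffer, which is the kind of corruption reported in this issue.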


Michael McCandless added a comment - 13/Apr/08 11:27 PM
I just committed this. Thanks Trejkaz!