|
[
Permlink
| « Hide
]
Nadav Har'El added a comment - 24/Oct/06 09:52 AM
The patch, which includes the change to BufferedIndexInput.readBytes(), and a new unit test for that class.
> I wonder why this happened.
readBytes on less than a buffer size probably only happens with binary (or compressed) fields, relatively new additions to Lucene, so it probably didn't have much of a real-world impact. I think it is important to fix though, as more things may be byte-oriented in the future. After applying the patch, at least one unit test fails: [junit] Testcase: testReadPastEOF(org.apache.lucene.index.TestCompoundFile): Sorry, I didn't notice that my fix broke this unit test. Thanks for catching that.
What is happening is interesting: this test (TestCompoundFile.testReadPastEof()) is testing what happens when you read 40 bytes beyond the end of file, and expects the appropriate exception to be thrown. The old code actually did this for 40 bytes, so it passed this test, but the interesting thing is that when you asked for more than a buffer-full of bytes, say, 10K, the length() checking code was not there! So the old code was broken in this respect, just not for 40 bytes which were tested. I'll fix my patch to add this beyond-end-of-file check, and will post the new patch ASAP. A fixed patch, which now checks that we don't read past of of file. This is now checked correctly in all three cases (1. data already in the buffer, 2. small number of bytes in addition to buffer 3. large number of bytes in addition to the buffer).
Note that the original code (before my patch) did not check length() for large number of bytes, only in refill() (which was only called for a small number of bytes). This code now checks in this case as well, so it is more correct than it was. The TestCompoundFile test now passes, and I also added to my new BufferedIndexInput unit test a third test case, testEOF, which tests that we can read up to EOF, but not past it. This test tests that small overflows (a few bytes) and very large overflows both throw an exception. I also made another change in this patch which I wish I didn't have to make, to account for other unit tests: One unit test assumed that readBytes() can work if given a null array, if the length requested is 0. Unfortunately, System.arraycopy doesn't share this permiscousity, so I had to add another silly if(len>0) test in the readBytes() code. > One unit test assumed that readBytes() can work if given a null array, if the length requested is 0. Unfortunately,
> System.arraycopy doesn't share this permiscousity, so I had to add another silly if(len>0) test in the readBytes() > code. If "given" a null array? Is this ever done in Lucene? Which should be fixed, the testcase or the code? > If "given" a null array? Is this ever done in Lucene? Which should be fixed, the testcase or the code?
I don't know - readBytes() documentation doesn't explictly say what should happen if it is asked to read zero bytes: is it simply supposed to do nothing (and in this case it doesn't matter which array you give it - could even be null), or should it still expect the array to be non-null? I don't know if any code in Lucene itself assumes that it can work when given a null array and a 0 count - I doubt it. But one test does assume this, so I simply added an extra "if" to check for the 0 count, and when that happens, avoid calling System.arraycopy() (which even when the count is 0, expects the array to be non-null, for some reason). |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||