Lucene - Core
  1. Lucene - Core
  2. LUCENE-2280

IndexWriter.optimize() throws NullPointerException

    Details

    • Type: Bug Bug
    • Status: Reopened
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 2.3.2
    • Fix Version/s: None
    • Component/s: core/index
    • Environment:

      Win 2003, lucene version 2.3.2, IBM JRE 1.6

    • Lucene Fields:
      New

      Description

      I am using lucene 2.3.2 search APIs for my application, i am indexing 45GB database which creates approax 200MB index file, after finishing the indexing and while running optimize() i can see NullPointerExcception thrown in my log and index file is getting corrupted, log says

      ------------------------------------------------------------------------
      Caused by:
      java.lang.NullPointerException
      at org.apache.lucene.store.BufferedIndexOutput.writeBytes(BufferedIndexOutput.java:49)
      at org.apache.lucene.store.IndexOutput.writeBytes(IndexOutput.java:40)
      at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:566)
      at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:135)
      at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3273)
      at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2968)
      at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
      ------------------------------------------------------------------------
      and this is happening quite frequently, although I am not able to reproduce it on demand, I saw an issue logged which is some what related to mine issue (http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200809.mbox/%3C6E4A40DB-5EFC-42DA-A857-D59F4EC3496B@mikemccandless.com%3E) but the only difference here is I am not using Store.Compress for my fields, i am using Store.NO instead. please note that I am using IBM JRE for my application.

      Is this an issue with lucene?, if yes it is fixed in which version?

      1. lucene.jar
        650 kB
        Ritesh Nigam
      2. lucene.zip
        3.75 MB
        Ritesh Nigam
      3. LuceneUtils.zip
        2.40 MB
        Ritesh Nigam

        Activity

        Hide
        Michael McCandless added a comment -

        Are you sure you're using a stock version 2.3.2 of Lucene?

        I ask because... the line numbers in SegmentMerger (specifically 566) don't correlate to 2.3.2. The other line numbers do match. It's odd.

        But looking at the code I don't see how either of the arrays being passed to System.arraycopy can be null.

        Can you turn on IndexWriter's infoStream and capture & post the output?

        It's also strange that this leads to index corruption; it shouldn't (the merge should just fail, and the index should be untouched). Can you run CheckIndex on the index and post what corruption it uncovers.

        Does this happen in a Sun JRE?

        Show
        Michael McCandless added a comment - Are you sure you're using a stock version 2.3.2 of Lucene? I ask because... the line numbers in SegmentMerger (specifically 566) don't correlate to 2.3.2. The other line numbers do match. It's odd. But looking at the code I don't see how either of the arrays being passed to System.arraycopy can be null. Can you turn on IndexWriter's infoStream and capture & post the output? It's also strange that this leads to index corruption; it shouldn't (the merge should just fail, and the index should be untouched). Can you run CheckIndex on the index and post what corruption it uncovers. Does this happen in a Sun JRE?
        Hide
        Ritesh Nigam added a comment -

        Are you sure you're using a stock version 2.3.2 of Lucene?

        • Yes, I checked the manifest of the jar.

        I ask because... the line numbers in SegmentMerger (specifically 566) don't correlate to 2.3.2. The other line numbers do match. It's odd.

        But looking at the code I don't see how either of the arrays being passed to System.arraycopy can be null.

        Can you turn on IndexWriter's infoStream and capture & post the output?

        • I have turned on the infostream for IndexWriter, it will take some time to get the result. once I get the result I will post that.

        It's also strange that this leads to index corruption; it shouldn't (the merge should just fail, and the index should be untouched). Can you run CheckIndex on the index and post what corruption it uncovers.

        • Here index corruption I mean that the main index file is getting deleted and search is not returning expected result. Hence there is no index file exists after the NullPointerExcepton, I cannot run CheckIndex.

        Does this happen in a Sun JRE?

        • I have not yet tested the same scenario on Sun JRE till now.
        Show
        Ritesh Nigam added a comment - Are you sure you're using a stock version 2.3.2 of Lucene? Yes, I checked the manifest of the jar. I ask because... the line numbers in SegmentMerger (specifically 566) don't correlate to 2.3.2. The other line numbers do match. It's odd. But looking at the code I don't see how either of the arrays being passed to System.arraycopy can be null. Can you turn on IndexWriter's infoStream and capture & post the output? I have turned on the infostream for IndexWriter, it will take some time to get the result. once I get the result I will post that. It's also strange that this leads to index corruption; it shouldn't (the merge should just fail, and the index should be untouched). Can you run CheckIndex on the index and post what corruption it uncovers. Here index corruption I mean that the main index file is getting deleted and search is not returning expected result. Hence there is no index file exists after the NullPointerExcepton, I cannot run CheckIndex. Does this happen in a Sun JRE? I have not yet tested the same scenario on Sun JRE till now.
        Hide
        Ritesh Nigam added a comment -

        Attaching the lucene.jar which i am using for my application.

        Show
        Ritesh Nigam added a comment - Attaching the lucene.jar which i am using for my application.
        Hide
        Ritesh Nigam added a comment -

        lucene.jar my application is using

        Show
        Ritesh Nigam added a comment - lucene.jar my application is using
        Hide
        Michael McCandless added a comment -

        Indeed that JAR is identical to 2.3.2. Weird. Not sure why the line number doesn't line up. Irks me.

        Here index corruption I mean that the main index file is getting deleted and search is not returning expected result. Hence there is no index file exists after the NullPointerExcepton, I cannot run CheckIndex.

        That's even stranger – nothing should get deleted because a merge fails. Is it possible your app has an exception handler doing this? Or maybe this is a brand new index, and it doesn't get properly closed (ie, no commit) when this exception is hit? If not... can you provide more details? An exception like this should have no impact on the original index.

        Please post the infoStream output when you get it, and report back whether this happens on Sun's JVM. But I still can't see how either of the arrays could be null here... this is a weird one.

        Are you using the latest updates to the IBM 1.6 JRE?

        Show
        Michael McCandless added a comment - Indeed that JAR is identical to 2.3.2. Weird. Not sure why the line number doesn't line up. Irks me. Here index corruption I mean that the main index file is getting deleted and search is not returning expected result. Hence there is no index file exists after the NullPointerExcepton, I cannot run CheckIndex. That's even stranger – nothing should get deleted because a merge fails. Is it possible your app has an exception handler doing this? Or maybe this is a brand new index, and it doesn't get properly closed (ie, no commit) when this exception is hit? If not... can you provide more details? An exception like this should have no impact on the original index. Please post the infoStream output when you get it, and report back whether this happens on Sun's JVM. But I still can't see how either of the arrays could be null here... this is a weird one. Are you using the latest updates to the IBM 1.6 JRE?
        Hide
        Ritesh Nigam added a comment -

        Sorry for the delayed response but i was trying to collect some data, I still have to wait for the log created by lucene APIs after turning on infostream, i will post that log file as soon as i get it. To answer your questions, i am running on Windows 2003 server with IBM JRE 1.6 and I am trying to index a database that is about 45 GB in size. On Unix, i am not facing this issue.

        With the lucene logging on, i waited for about 2 days for the NPE to reoccur, but it did not happen. I was tyring with the stock jar file of lucene 2.3.2. However, indexing failed again and the index file was again deleted. Looking into the logs this time, I found the following exceptions:

        16:06:48 03/01/2010: Fatal error while indexing CRs.
        Stack trace: java.io.FileNotFoundException: index_21f.fdt (The system cannot find the file specified.)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:218)
        at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
        at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:440)
        at org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java:206)
        at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:173)
        at org.apache.lucene.index.DocumentsWriter.createCompoundFile(DocumentsWriter.java:576)
        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2708)
        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2539)
        at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1222)
        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1196)
        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1164)

        I am not sure if deletion of the index file when these exceptions are thrown, is the right behaviour.

        I am re-running my test to try and reproduce the NPE issue which i had reported earlier and i intend to pass on the logs to you as soon as i have it. However, are you aware of any issue that a combination of IBM JRE + Windows 2003 server would have with lucene?

        As far as the code is concern I am closing the indexwriter in finally block, when is thrown I am logging it in catch block and then finally closing the indexwriter.

        Show
        Ritesh Nigam added a comment - Sorry for the delayed response but i was trying to collect some data, I still have to wait for the log created by lucene APIs after turning on infostream, i will post that log file as soon as i get it. To answer your questions, i am running on Windows 2003 server with IBM JRE 1.6 and I am trying to index a database that is about 45 GB in size. On Unix, i am not facing this issue. With the lucene logging on, i waited for about 2 days for the NPE to reoccur, but it did not happen. I was tyring with the stock jar file of lucene 2.3.2. However, indexing failed again and the index file was again deleted. Looking into the logs this time, I found the following exceptions: 16:06:48 03/01/2010: Fatal error while indexing CRs. Stack trace: java.io.FileNotFoundException: index_21f.fdt (The system cannot find the file specified.) at java.io.RandomAccessFile.open(Native Method) at java.io.RandomAccessFile.<init>(RandomAccessFile.java:218) at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506) at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536) at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445) at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:440) at org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java:206) at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:173) at org.apache.lucene.index.DocumentsWriter.createCompoundFile(DocumentsWriter.java:576) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2708) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2539) at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1222) at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1196) at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1164) I am not sure if deletion of the index file when these exceptions are thrown, is the right behaviour. I am re-running my test to try and reproduce the NPE issue which i had reported earlier and i intend to pass on the logs to you as soon as i have it. However, are you aware of any issue that a combination of IBM JRE + Windows 2003 server would have with lucene? As far as the code is concern I am closing the indexwriter in finally block, when is thrown I am logging it in catch block and then finally closing the indexwriter.
        Hide
        Michael McCandless added a comment -

        However, indexing failed again and the index file was again deleted.

        Are you calling .commit() periodically? It's odd, otherwise, that you have no index on hitting any problem. (It's not necessary to call commit until the very end, if you don't need readers to see the ongoing changes, don't need the durability, etc.).

        I am not sure if deletion of the index file when these exceptions are thrown, is the right behaviour.

        Stack trace: java.io.FileNotFoundException: index_21f.fdt (The system cannot find the file specified.)

        Lucene isn't "intentionally" deleting files that are referenced by the commit. But if you're not committing during the whole IW session, then you'd see no index at the end.

        But, this FNFE is not expected in any event. Seeing the full infoStream leading to this may help...

        That FNFE is happening as IW tries to build the CFS of the segment flushed during close.

        Are you using autoCommit=true or false?

        The fact that this only happens on Windows is telling.... is there a virus checker running on this machine?

        It'd also be good to know if this still happens on newer versions of Lucene.

        Show
        Michael McCandless added a comment - However, indexing failed again and the index file was again deleted. Are you calling .commit() periodically? It's odd, otherwise, that you have no index on hitting any problem. (It's not necessary to call commit until the very end, if you don't need readers to see the ongoing changes, don't need the durability, etc.). I am not sure if deletion of the index file when these exceptions are thrown, is the right behaviour. Stack trace: java.io.FileNotFoundException: index_21f.fdt (The system cannot find the file specified.) Lucene isn't "intentionally" deleting files that are referenced by the commit. But if you're not committing during the whole IW session, then you'd see no index at the end. But, this FNFE is not expected in any event. Seeing the full infoStream leading to this may help... That FNFE is happening as IW tries to build the CFS of the segment flushed during close. Are you using autoCommit=true or false? The fact that this only happens on Windows is telling.... is there a virus checker running on this machine? It'd also be good to know if this still happens on newer versions of Lucene.
        Hide
        Ritesh Nigam added a comment -

        I checked the documentation of IndexWriter in 2.3.2, API commit() is not available with this version (I think it is introduced in 2.4), I am not explicitely setting autoCommit, so it should take default value which I believe is "true".

        One more thing I am catching any exception hitting during indexing or optimizing, and then in finally block i am closing the IndexWriter by calling close(), method which sould take care of commit internally? Please suggest me if there is any equivalent method which i can use in place of commit()

        I have not upgraded to the newer version of lucene, but probably i will try 3.0.0 version of lucene in future.

        Show
        Ritesh Nigam added a comment - I checked the documentation of IndexWriter in 2.3.2, API commit() is not available with this version (I think it is introduced in 2.4), I am not explicitely setting autoCommit, so it should take default value which I believe is "true". One more thing I am catching any exception hitting during indexing or optimizing, and then in finally block i am closing the IndexWriter by calling close(), method which sould take care of commit internally? Please suggest me if there is any equivalent method which i can use in place of commit() I have not upgraded to the newer version of lucene, but probably i will try 3.0.0 version of lucene in future.
        Hide
        Michael McCandless added a comment -

        I checked the documentation of IndexWriter in 2.3.2, API commit() is not available with this version (I think it is introduced in 2.4), I am not explicitely setting autoCommit, so it should take default value which I believe is "true".

        Ahh right sorry. Hard to remember that far back!

        One more thing I am catching any exception hitting during indexing or optimizing, and then in finally block i am closing the IndexWriter by calling close(), method which sould take care of commit internally? Please suggest me if there is any equivalent method which i can use in place of commit()

        That's right, though, Lucene doesn't call "fsync" (to the OS) in
        2.3.x... so if the machine/OS crashes it could corrupt your index. So
        it's not quite doing the same thing as commit() does in newer Lucene
        releases (ensure durability), but it does save everything to the
        index.

        I have not upgraded to the newer version of lucene, but probably i will try 3.0.0 version of lucene in future.

        OK.

        Show
        Michael McCandless added a comment - I checked the documentation of IndexWriter in 2.3.2, API commit() is not available with this version (I think it is introduced in 2.4), I am not explicitely setting autoCommit, so it should take default value which I believe is "true". Ahh right sorry. Hard to remember that far back! One more thing I am catching any exception hitting during indexing or optimizing, and then in finally block i am closing the IndexWriter by calling close(), method which sould take care of commit internally? Please suggest me if there is any equivalent method which i can use in place of commit() That's right, though, Lucene doesn't call "fsync" (to the OS) in 2.3.x... so if the machine/OS crashes it could corrupt your index. So it's not quite doing the same thing as commit() does in newer Lucene releases (ensure durability), but it does save everything to the index. I have not upgraded to the newer version of lucene, but probably i will try 3.0.0 version of lucene in future. OK.
        Hide
        Ritesh Nigam added a comment -

        Yesterday again search indxer crashed for my application and index file got deleted, this time had turned on the infostream on for indexwriter, attaching the infostream log file.

        Show
        Ritesh Nigam added a comment - Yesterday again search indxer crashed for my application and index file got deleted, this time had turned on the infostream on for indexwriter, attaching the infostream log file.
        Hide
        Ritesh Nigam added a comment -

        Lucene infostream log file.

        Show
        Ritesh Nigam added a comment - Lucene infostream log file.
        Hide
        Ritesh Nigam added a comment -

        I installed a test setup with lucene 3.0.0 and tried to reproduce the scenario with NPE, but after the Exception thrown, main index file is not getting deleted but only optimize is failing and i can see some small index file (.cfs) also along with main index file, and one more thing here is i am not using commit yet, but using close(), does close do the same thing as commit does?

        By looking at above behavior, is there a bug in 2.3.2 version where this kind of situaion is not handled properly?

        Can you please have a look at the log which i got after turning on the infostream for IndexWriter(for lucene 2.3.2). Attached as lucene.zip.

        Show
        Ritesh Nigam added a comment - I installed a test setup with lucene 3.0.0 and tried to reproduce the scenario with NPE, but after the Exception thrown, main index file is not getting deleted but only optimize is failing and i can see some small index file (.cfs) also along with main index file, and one more thing here is i am not using commit yet, but using close(), does close do the same thing as commit does? By looking at above behavior, is there a bug in 2.3.2 version where this kind of situaion is not handled properly? Can you please have a look at the log which i got after turning on the infostream for IndexWriter(for lucene 2.3.2). Attached as lucene.zip.
        Hide
        Michael McCandless added a comment -

        From the log I can see that you run fine for a long time, opening IW,
        indexing a few docs, optimizing, then closing. Then suddenly the
        exceptions start happening on many (but not all) merges, and, merges
        involving different segments. JRE bug seems most likely I guess...

        Since you see this only on Windows (not eg on Linux), I think this is
        likely not a bug in Lucene but rather something particular about your
        Windows env – virus checker maybe? Is there anything in the Windows
        events log that correlate to when the exceptions start?

        Or it could be a JRE bug – you really should try on different (Sun)
        JRE.

        Show
        Michael McCandless added a comment - From the log I can see that you run fine for a long time, opening IW, indexing a few docs, optimizing, then closing. Then suddenly the exceptions start happening on many (but not all) merges, and, merges involving different segments. JRE bug seems most likely I guess... Since you see this only on Windows (not eg on Linux), I think this is likely not a bug in Lucene but rather something particular about your Windows env – virus checker maybe? Is there anything in the Windows events log that correlate to when the exceptions start? Or it could be a JRE bug – you really should try on different (Sun) JRE.
        Hide
        Michael McCandless added a comment -

        Yes close() does commit() internally.

        Are you saying you see the same exception on 3.0, using the IBM JRE? Can you try with the Sun JRE?

        Show
        Michael McCandless added a comment - Yes close() does commit() internally. Are you saying you see the same exception on 3.0, using the IBM JRE? Can you try with the Sun JRE?
        Hide
        Ritesh Nigam added a comment -

        I will further investigate it if it is a JRE bug, There are few points which i need some advise
        1. To fix this issue if i disable the optimize (remove the call to IndexWriter.optimize() from my code) will that create any problem in the long run? if yes what kind of problems it may create?
        2. Just by knowing the scenario, that after a NullPointerException, index file gets deleted, can you provide me a patch where any kind of exception is handled by the Lucene API and my index remains untouched?

        Show
        Ritesh Nigam added a comment - I will further investigate it if it is a JRE bug, There are few points which i need some advise 1. To fix this issue if i disable the optimize (remove the call to IndexWriter.optimize() from my code) will that create any problem in the long run? if yes what kind of problems it may create? 2. Just by knowing the scenario, that after a NullPointerException, index file gets deleted, can you provide me a patch where any kind of exception is handled by the Lucene API and my index remains untouched?
        Hide
        Michael McCandless added a comment -

        1. To fix this issue if i disable the optimize (remove the call to IndexWriter.optimize() from my code) will that create any problem in the long run? if yes what kind of problems it may create?

        Most apps never need to optimize.

        You should only optimize if your search performance is not good enough, and really, before using optimize you should explore other ideas (see ImproveSearchingSpeed on the wiki).

        2. Just by knowing the scenario, that after a NullPointerException, index file gets deleted, can you provide me a patch where any kind of exception is handled by the Lucene API and my index remains untouched?

        Can't make a patch here until we understand what's actually happening...

        It looks like your index got truncated at IW 443. Hmm actually it looks like IW 442 never successfully closed. (do a grep for ": at close:" and you'll see it's missing).

        Maybe this is the problem. Because every merge kept hitting exceptions it looks like IW 442, even on close, was throwing the exceptions, and never closed properly.

        Can you try using IW.close(false) in your finally clause? This aborts all running merges and closes the index and likely will workaround whatever (seeming like a JRE bug) is causing the NPE during merging.

        Also, you could try using autoCommit=false.

        Show
        Michael McCandless added a comment - 1. To fix this issue if i disable the optimize (remove the call to IndexWriter.optimize() from my code) will that create any problem in the long run? if yes what kind of problems it may create? Most apps never need to optimize. You should only optimize if your search performance is not good enough, and really, before using optimize you should explore other ideas (see ImproveSearchingSpeed on the wiki). 2. Just by knowing the scenario, that after a NullPointerException, index file gets deleted, can you provide me a patch where any kind of exception is handled by the Lucene API and my index remains untouched? Can't make a patch here until we understand what's actually happening... It looks like your index got truncated at IW 443. Hmm actually it looks like IW 442 never successfully closed. (do a grep for ": at close:" and you'll see it's missing). Maybe this is the problem. Because every merge kept hitting exceptions it looks like IW 442, even on close, was throwing the exceptions, and never closed properly. Can you try using IW.close(false) in your finally clause? This aborts all running merges and closes the index and likely will workaround whatever (seeming like a JRE bug) is causing the NPE during merging. Also, you could try using autoCommit=false.
        Hide
        Ritesh Nigam added a comment -

        Thanks michael, the fix you provided seems to be working fine for me, after using close(false) index file is not getting deleted and search is also working fine though optimization is not happening but still its fine as far as search is working. I have attached a sample application which replicates the actual application, please have a look at that and let me know if anything else can be fine tuned.

        Show
        Ritesh Nigam added a comment - Thanks michael, the fix you provided seems to be working fine for me, after using close(false) index file is not getting deleted and search is also working fine though optimization is not happening but still its fine as far as search is working. I have attached a sample application which replicates the actual application, please have a look at that and let me know if anything else can be fine tuned.
        Hide
        Ritesh Nigam added a comment -

        Sample application which replicates the actual application.

        Show
        Ritesh Nigam added a comment - Sample application which replicates the actual application.
        Hide
        Erick Erickson added a comment -

        2013 Old JIRA cleanup

        Show
        Erick Erickson added a comment - 2013 Old JIRA cleanup

          People

          • Assignee:
            Unassigned
            Reporter:
            Ritesh Nigam
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:

              Development