Lucene - Core
LUCENE-2476

IndexWriter constructor lets runtime exceptions propagate while the write lock stays held

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.0.1
    • Fix Version/s: 2.9.3, 3.0.2, 3.1, 4.0-ALPHA
    • Component/s: core/store
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The IndexWriter constructor lets runtime exceptions propagate while the write lock stays held.

      The init method in IndexWriter catches only IOException (I got a NegativeArraySizeException when reading a corrupt index), and once that happens there is no way to recover, because the write lock remains held. Moreover, I have no IndexWriter instance through which to "grab" the lock, since init() is called from the IndexWriter constructor.

      Either broaden the catch to all exceptions, or at least provide some way to clean up. In my case I'd like to fall back: just delete the corrupted index from disk and recreate it. That is impossible, because NativeFSLockFactory's LOCK_HELD entry for the obtained write lock is never cleared, and there is no (at least apparent) way to clear it forcibly. I can't create a new IndexWriter, since it will always fail with LockObtainFailedException.
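      The fallback described above (delete the corrupted index, then recreate it) could be sketched roughly as follows. This is only an illustration under stated assumptions: `IndexRecovery`, `Opener` and `openOrRecreate` are hypothetical names, and `Opener` merely stands in for constructing an IndexWriter over a directory. The second open can only succeed if the first, failed attempt released its lock, which is exactly what this issue reports as missing in 3.0.1.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene.
class IndexRecovery {
    /** Stand-in for "construct a writer over this directory". */
    interface Opener<T> {
        T open(Path dir) throws IOException;
    }

    /** Try to open; on any failure wipe the directory and open once more. */
    static <T> T openOrRecreate(Path dir, Opener<T> opener) throws IOException {
        try {
            return opener.open(dir);
        } catch (IOException | RuntimeException e) {
            // Corruption may surface as an unchecked exception, so catch both kinds.
            deleteRecursively(dir);
            Files.createDirectories(dir);
            return opener.open(dir);
        }
    }

    /** Delete a directory tree, children before parents. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

      A caller would pass a lambda that builds the real writer; the retry is deliberately limited to one attempt so that a persistent failure still surfaces to the caller.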

      1. LUCENE-2476.patch
        2 kB
        Michael McCandless

        Activity

        Shai Erera added a comment -

        Can you post here the full stacktrace?

        Michael McCandless added a comment -

        I agree, we should fix this. I'll change to a try/finally w/ a success boolean.

        You can use IndexWriter#unlock to forcefully remove the lock, as a workaround.
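        The "try/finally with a success boolean" idiom mentioned above can be illustrated with a small, self-contained sketch. This is not the attached patch: `GuardedWriter` is a hypothetical class, and an `AtomicBoolean` stands in for the write lock. The point is that the finally block releases the lock on any failure, checked or unchecked, whereas a `catch (IOException e)` would miss a NegativeArraySizeException.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a writer whose constructor takes a lock; not Lucene code.
class GuardedWriter {
    private final AtomicBoolean writeLock;

    GuardedWriter(AtomicBoolean writeLock, Runnable init) {
        // Obtain the lock; fail fast if it is already held.
        if (!writeLock.compareAndSet(false, true)) {
            throw new IllegalStateException("lock already held");
        }
        this.writeLock = writeLock;

        boolean success = false;
        try {
            init.run();      // may throw any RuntimeException
            success = true;  // reached only if initialization completed
        } finally {
            if (!success) {
                // Release on every failure path so a later attempt can obtain the lock.
                writeLock.set(false);
            }
        }
    }

    /** Release the lock when the writer is no longer needed. */
    void close() {
        writeLock.set(false);
    }
}
```

        Compared with catching Throwable and rethrowing, this form leaves the original exception untouched and needs no knowledge of which exception types the initialization code can raise.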

        Cservenak, Tamas added a comment -

        I tried both IndexWriter#unlock and Directory#clearLock(IndexWriter.WRITE_LOCK_NAME), but neither of them removed the entry from the LOCK_HELD HashSet. It was unchanged.

        NativeFSLock#release() returned false in both cases.

        So this is what I meant by "provide some circumvention": so far I have not found any other means to remove the entry from LOCK_HELD. None of these removed it.
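        One way such behaviour can arise is sketched below. This is a simplified model written for illustration, not NativeFSLockFactory's actual code, and the real cause is tracked separately in LUCENE-2104. The assumptions are that held paths are recorded in a process-wide set (as the LOCK_HELD name suggests) and that release() only acts when called on the instance that obtained the lock. `ModelLock` and its fields are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model of an in-process lock registry; names are hypothetical.
class ModelLock {
    // Process-wide set of lock paths currently considered held.
    static final Set<String> LOCK_HELD = new HashSet<>();

    private final String path;
    private boolean heldByThisInstance;

    ModelLock(String path) {
        this.path = path;
    }

    /** Succeeds only if no instance in this process has the path registered. */
    synchronized boolean obtain() {
        synchronized (LOCK_HELD) {
            if (!LOCK_HELD.add(path)) {
                return false;
            }
        }
        heldByThisInstance = true;
        return true;
    }

    /** Removes the registry entry only when this instance obtained the lock. */
    synchronized boolean release() {
        if (!heldByThisInstance) {
            return false; // a fresh instance cannot clear someone else's entry
        }
        synchronized (LOCK_HELD) {
            LOCK_HELD.remove(path);
        }
        heldByThisInstance = false;
        return true;
    }
}
```

        In this model, if the instance that obtained the lock becomes unreachable (for example because a constructor threw), any new instance created for the same path gets false from both release() and obtain(), which matches the symptoms reported here.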

        Michael McCandless added a comment -

        Patch.

        Michael McCandless added a comment -

        I tried both IndexWriter#unlock and Directory#clearLock(IndexWriter.WRITE_LOCK_NAME), but neither of them removed the entry from the LOCK_HELD HashSet. It was unchanged.

        Ahh, sorry, I think you are hitting LUCENE-2104.

        Cservenak, Tamas added a comment -

        Just to confirm that this patch is a fix.

        The patch applied to 3.0.1 (I had to do it manually, since I believe this patch is against trunk, not 3.0.1) does fix my problem. The IndexWriter is now successfully recreated and my UT does recover just fine from corrupted indexes.

        Cservenak, Tamas added a comment -

        Yes, I do hit LUCENE-2104 at the same time... nice.

        Shai Erera added a comment -

        Out of curiosity - would you mind posting here the exception?

        Cservenak, Tamas added a comment -

        This is a UT that first copies known (broken) index files into place and then tries to use them. Naturally, it fails (since the index files are corrupted), and then it tries to recreate the index files and the index content, but it fails to obtain the write lock again. With the patch above applied to 3.0.1, the UT passes.

        This is the stack trace I have with vanilla 3.0.1:

        org.sonatype.timeline.TimelineException: Fail to configure timeline index!
        	at org.sonatype.timeline.DefaultTimelineIndexer.configure(DefaultTimelineIndexer.java:106)
        	at org.sonatype.timeline.DefaultTimeline.repairTimelineIndexer(DefaultTimeline.java:79)
        	at org.sonatype.timeline.DefaultTimeline.configure(DefaultTimeline.java:60)
        	at org.sonatype.timeline.TimelineTest.testRepairIndexCouldNotRead(TimelineTest.java:103)
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        	at java.lang.reflect.Method.invoke(Method.java:597)
        	at junit.framework.TestCase.runTest(TestCase.java:164)
        	at junit.framework.TestCase.runBare(TestCase.java:130)
        	at junit.framework.TestResult$1.protect(TestResult.java:106)
        	at junit.framework.TestResult.runProtected(TestResult.java:124)
        	at junit.framework.TestResult.run(TestResult.java:109)
        	at junit.framework.TestCase.run(TestCase.java:120)
        	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
        	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
        Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/Users/cstamas/worx/sonatype/spice/trunk/spice-timeline/target/index/write.lock
        	at org.apache.lucene.store.Lock.obtain(Lock.java:84)
        	at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1045)
        	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:868)
        	at org.sonatype.timeline.DefaultTimelineIndexer.configure(DefaultTimelineIndexer.java:99)
        	... 19 more
        
        Michael McCandless added a comment -

        The patch applied to 3.0.1 (I had to do it manually, since I believe this patch is against trunk, not 3.0.1) does fix my problem. The IndexWriter is now successfully recreated and my UT does recover just fine from corrupted indexes.

        OK thanks for confirming – I'll backport to 3.0.x as well.

        (Yes patch is against trunk).

        Shai Erera added a comment -

        This stack trace shows a LockObtainFailedException. Can you post the one that resulted in the NegativeArraySizeException? I'm curious to know where you hit it, and what sort of corruption leads to that.

        Cservenak, Tamas added a comment -

        This is a Lucene index known to be corrupt (either taken from a "live" Nexus or "broken" manually by tampering with a hex editor; I don't remember anymore). The index was created with Lucene 2.3.2, so I believe an index upgrade also happens during this UT.

        [INFO] Failed to configure timeline index, trying to repair it.
        org.sonatype.timeline.TimelineException: Fail to configure timeline index!
        	at org.sonatype.timeline.DefaultTimelineIndexer.configure(DefaultTimelineIndexer.java:107)
        	at org.sonatype.timeline.DefaultTimeline.configure(DefaultTimeline.java:49)
        	at org.sonatype.timeline.TimelineTest.testRepairIndexCouldNotRead(TimelineTest.java:103)
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        	at java.lang.reflect.Method.invoke(Method.java:597)
        	at junit.framework.TestCase.runTest(TestCase.java:164)
        	at junit.framework.TestCase.runBare(TestCase.java:130)
        	at junit.framework.TestResult$1.protect(TestResult.java:106)
        	at junit.framework.TestResult.runProtected(TestResult.java:124)
        	at junit.framework.TestResult.run(TestResult.java:109)
        	at junit.framework.TestCase.run(TestCase.java:120)
        	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
        	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
        	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
        Caused by: java.lang.NegativeArraySizeException
        	at org.apache.lucene.store.IndexInput.readString(IndexInput.java:126)
        	at org.apache.lucene.index.SegmentInfo.<init>(SegmentInfo.java:173)
        	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:258)
        	at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:312)
        	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:677)
        	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:521)
        	at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:308)
        	at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1076)
        	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:868)
        	at org.sonatype.timeline.DefaultTimelineIndexer.configure(DefaultTimelineIndexer.java:100)
        	... 18 more
        
        Uwe Schindler added a comment -

        Merged to 2.9 revision: 949507


          People

          • Assignee:
            Michael McCandless
            Reporter:
            Cservenak, Tamas