Lucene - Core
  1. Lucene - Core
  2. LUCENE-4083

RateLimited.pause() throws unchecked exception

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.0-ALPHA
    • Component/s: core/store
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The while() loop in RateLimiter.pause() invokes Thread.sleep() with potentially large values, which occasionally results in InterruptedException being thrown from Thread.sleep(). This is wrapped in an unchecked ThreadInterruptedException and re-thrown, and results in high-level errors like this:

          [junit] 2012-05-29 15:50:15,464 ERROR core.SolrCore - org.apache.lucene.util.ThreadInterruptedException: java.lang.InterruptedException: sleep interrupted
          [junit] 	at org.apache.lucene.store.RateLimiter.pause(RateLimiter.java:82)
          [junit] 	at org.apache.lucene.store.MockIndexOutputWrapper.writeBytes(MockIndexOutputWrapper.java:82)
          [junit] 	at org.apache.lucene.store.MockIndexOutputWrapper.writeByte(MockIndexOutputWrapper.java:73)
          [junit] 	at org.apache.lucene.store.DataOutput.writeVInt(DataOutput.java:191)
          [junit] 	at org.apache.lucene.codecs.lucene40.Lucene40PostingsWriter.addPosition(Lucene40PostingsWriter.java:237)
          [junit] 	at org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:519)
          [junit] 	at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:92)
          [junit] 	at org.apache.lucene.index.TermsHash.flush(TermsHash.java:117)
          [junit] 	at org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
          [junit] 	at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
          [junit] 	at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:475)
          [junit] 	at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:422)
          [junit] 	at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:553)
          [junit] 	at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2416)
          [junit] 	at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2548)
          [junit] 	at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2530)
          [junit] 	at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:414)
          [junit] 	at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:82)
      

      I believe this is a bug - the while() loop already ensures that the total time spent in pause() is correct even if InterruptedException-s are thrown, so they should not be re-thrown.

      The patch is trivial - simply don't re-throw.

        Activity

        Hide
        Michael McCandless added a comment -

        But, shouldn't we re-throw? Ie, you interrupt so that the thread stops sleeping and throws an exception?

        Something somewhere higher up must catch the ThreadInterruptedException I think?

        Show
        Michael McCandless added a comment - But, shouldn't we re-throw? Ie, you interrupt so that the thread stops sleeping and throws an exception? Something somewhere higher up must catch the ThreadInterruptedException I think?
        Hide
        Dawid Weiss added a comment -

        Thread.sleep() with potentially large values, which occasionally results in InterruptedException being thrown from Thread.sleep()

        There are no "spurious interrupts" from sleep... something has interrupted that thread so just not re-throwing is probably not a good idea?

        Show
        Dawid Weiss added a comment - Thread.sleep() with potentially large values, which occasionally results in InterruptedException being thrown from Thread.sleep() There are no "spurious interrupts" from sleep... something has interrupted that thread so just not re-throwing is probably not a good idea?
        Hide
        Andrzej Bialecki added a comment -

        Hmm, right ... something interrupted here, so it shouldn't be ignored. Still, I think this exception should have been caught somewhere higher and dealt with e.g. at the level of IndexWriter or in Solr. Reporting this to user doesn't seem friendly or useful. Since it results in a failed commit, why not catch it and say so (and re-throw as a checked exception).

        Show
        Andrzej Bialecki added a comment - Hmm, right ... something interrupted here, so it shouldn't be ignored. Still, I think this exception should have been caught somewhere higher and dealt with e.g. at the level of IndexWriter or in Solr. Reporting this to user doesn't seem friendly or useful. Since it results in a failed commit, why not catch it and say so (and re-throw as a checked exception).
        Hide
        Dawid Weiss added a comment -

        Andrzej can you include the full stack trace? Did this happen when testing (I guess)?

        Show
        Dawid Weiss added a comment - Andrzej can you include the full stack trace? Did this happen when testing (I guess)?
        Hide
        Andrzej Bialecki added a comment -

        Yes, it happened during testing, at a high CPU load. Unfortunately I ran the tests again, so the original log is not available anymore, and the problem didn't occur ... I'll add the full stack trace when this appears again.

        Show
        Andrzej Bialecki added a comment - Yes, it happened during testing, at a high CPU load. Unfortunately I ran the tests again, so the original log is not available anymore, and the problem didn't occur ... I'll add the full stack trace when this appears again.
        Hide
        Michael McCandless added a comment -

        Still, I think this exception should have been caught somewhere higher and dealt with e.g. at the level of IndexWriter or in Solr.

        I agree: I think in general whoever does the interrupting (test framework in this case?) should somehow handle the resulting thrown interrupted exc...

        Show
        Michael McCandless added a comment - Still, I think this exception should have been caught somewhere higher and dealt with e.g. at the level of IndexWriter or in Solr. I agree: I think in general whoever does the interrupting (test framework in this case?) should somehow handle the resulting thrown interrupted exc...
        Hide
        Dawid Weiss added a comment -

        I agree: I think in general whoever does the interrupting (test framework in this case?) should somehow handle the resulting thrown interrupted exc...

        As far as I remember the test framework does not report exceptions after any of the following happens:
        1) the test thread timeouts,
        2) the test thread terminates but a thread leak has been detected (this is an error on its own).

        I'll verify if these conditions hold.

        Show
        Dawid Weiss added a comment - I agree: I think in general whoever does the interrupting (test framework in this case?) should somehow handle the resulting thrown interrupted exc... As far as I remember the test framework does not report exceptions after any of the following happens: 1) the test thread timeouts, 2) the test thread terminates but a thread leak has been detected (this is an error on its own). I'll verify if these conditions hold.
        Hide
        Andrzej Bialecki added a comment -

        I can't reproduce this, and anyway very likely it was just a test framework trying to interrupt all leftover threads.

        Show
        Andrzej Bialecki added a comment - I can't reproduce this, and anyway very likely it was just a test framework trying to interrupt all leftover threads.
        Hide
        Dawid Weiss added a comment -

        Just a comment on the belated follow-up check that I did. If the main test's thread times out, any forked threads will also be interrupted/ terminated as a consequence of this. Any exceptions that happen after the main thread is terminated will be wrapped as in here (note "after termination attempt").

        java.lang.RuntimeException: Thread threw an uncaught exception (after termination attempt), thread: Thread[spinning,5,]
        	at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:97)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$7(RandomizedRunner.java:804)
        	at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:671)
        	at java.lang.Thread.run(Thread.java:662)
        Caused by: java.lang.RuntimeException: foobar
        	at __randomizedtesting.SeedInfo.seed([54B0307E3EA5B518]:0)
        	at com.carrotsearch.randomizedtesting.TestTestTimeoutAndRunawayThreadException$Nested$1.run(TestTestTimeoutAndRunawayThreadException.java:22)
        

        So I don't think the above exception was a result of after-the-timeout exception, rather something else.

        Show
        Dawid Weiss added a comment - Just a comment on the belated follow-up check that I did. If the main test's thread times out, any forked threads will also be interrupted/ terminated as a consequence of this. Any exceptions that happen after the main thread is terminated will be wrapped as in here (note "after termination attempt"). java.lang.RuntimeException: Thread threw an uncaught exception (after termination attempt), thread: Thread [spinning,5,] at com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:97) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$7(RandomizedRunner.java:804) at com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:671) at java.lang. Thread .run( Thread .java:662) Caused by: java.lang.RuntimeException: foobar at __randomizedtesting.SeedInfo.seed([54B0307E3EA5B518]:0) at com.carrotsearch.randomizedtesting.TestTestTimeoutAndRunawayThreadException$Nested$1.run(TestTestTimeoutAndRunawayThreadException.java:22) So I don't think the above exception was a result of after-the-timeout exception, rather something else.
        Hide
        Dawid Weiss added a comment -

        I think I must back off the above – I think there is a possibility in which a leaked background thread would slip through. I've simulated this behavior but I'll need a refactoring to clean this up; thread leak management has become very hairy.

        I'm letting this issue go away for now, I'll get back to leaked threads though (soon) and I'm sure the problem will resurface then if it really is a problem.

        Show
        Dawid Weiss added a comment - I think I must back off the above – I think there is a possibility in which a leaked background thread would slip through. I've simulated this behavior but I'll need a refactoring to clean this up; thread leak management has become very hairy. I'm letting this issue go away for now, I'll get back to leaked threads though (soon) and I'm sure the problem will resurface then if it really is a problem.

          People

          • Assignee:
            Unassigned
            Reporter:
            Andrzej Bialecki
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development