Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-8223

Take care not to accidentally swallow OOMErrors

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.10.3
    • Fix Version/s: 5.4, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      This was first noticed with 4.10.3, but it looks like it still applies to trunk. There are a few places in the code where we catch Throwable and then don't check for OOM or rethrow it. This behaviour means that OOM kill scripts won't run, and the JVM can get into an undesirable state.

        Activity

        Hide
        mdrob Mike Drob added a comment -

        Attaching a patch that fixes an instance in CoreContainer and LIRThread.

        Show
        mdrob Mike Drob added a comment - Attaching a patch that fixes an instance in CoreContainer and LIRThread .
        Hide
        cpoerschke Christine Poerschke added a comment -

        Small change, looks reasonable to me.

        re: the CoreContainer change being to RuntimeException whereas the LeaderInitiatedRecoveryThread change is to Exception - just curious re: why the difference? The class hierarchy being Throwable -> Error -> VirtualMachineError -> OutOfMemoryError and Throwable -> Exception -> RuntimeException both changes will stop swallowing the OOMError but the CoreContainer will also stop catching non-RuntimeException Exceptions, so just curious re: the distinction.

        Show
        cpoerschke Christine Poerschke added a comment - Small change, looks reasonable to me. re: the CoreContainer change being to RuntimeException whereas the LeaderInitiatedRecoveryThread change is to Exception - just curious re: why the difference? The class hierarchy being Throwable -> Error -> VirtualMachineError -> OutOfMemoryError and Throwable -> Exception -> RuntimeException both changes will stop swallowing the OOMError but the CoreContainer will also stop catching non-RuntimeException Exceptions, so just curious re: the distinction.
        Hide
        mdrob Mike Drob added a comment -

        In CoreContainer, zkSys.registerInZk doesn't declare any checked exceptions, so we can narrow the scope to catching only unchecked ones. This might help us notice API or compatibility changes in the future.

        In LeaderInitiatedRecoveryThread, client.request declares both SolrServerException and IOException so we still have to handle them in some way. The existing code did not make a distinction between those and RuntimeException, so I made the minimal changes I could. It would have been reasonable to catch (IOException | SolrServerException | RuntimeException e) but I didn't think of that at the time and I'm not sure it really adds much value.

        Show
        mdrob Mike Drob added a comment - In CoreContainer , zkSys.registerInZk doesn't declare any checked exceptions, so we can narrow the scope to catching only unchecked ones. This might help us notice API or compatibility changes in the future. In LeaderInitiatedRecoveryThread , client.request declares both SolrServerException and IOException so we still have to handle them in some way. The existing code did not make a distinction between those and RuntimeException , so I made the minimal changes I could. It would have been reasonable to catch (IOException | SolrServerException | RuntimeException e) but I didn't think of that at the time and I'm not sure it really adds much value.
        Hide
        cpoerschke Christine Poerschke added a comment -

        Makes sense. Thanks for explaining.

        Show
        cpoerschke Christine Poerschke added a comment - Makes sense. Thanks for explaining.
        Hide
        mdrob Mike Drob added a comment -

        Christine Poerschke - do you think this is good to commit, or are there other changes you think it should have? I tried to figure out what kind of meaningful tests I could add, but couldn't come up with anything that was non-trivial and useful.

        Show
        mdrob Mike Drob added a comment - Christine Poerschke - do you think this is good to commit, or are there other changes you think it should have? I tried to figure out what kind of meaningful tests I could add, but couldn't come up with anything that was non-trivial and useful.
        Hide
        cpoerschke Christine Poerschke added a comment -

        I think this is good to commit. (Won't get to it today though.)

        Show
        cpoerschke Christine Poerschke added a comment - I think this is good to commit. (Won't get to it today though.)
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1712817 from Christine Poerschke in branch 'dev/trunk'
        [ https://svn.apache.org/r1712817 ]

        SOLR-8223: Avoid accidentally swallowing OutOfMemoryError (in LeaderInitiatedRecoveryThread.java or CoreContainer.java)

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1712817 from Christine Poerschke in branch 'dev/trunk' [ https://svn.apache.org/r1712817 ] SOLR-8223 : Avoid accidentally swallowing OutOfMemoryError (in LeaderInitiatedRecoveryThread.java or CoreContainer.java)
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1712842 from Christine Poerschke in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1712842 ]

        SOLR-8223: Avoid accidentally swallowing OutOfMemoryError (in LeaderInitiatedRecoveryThread.java or CoreContainer.java) (merge in revision 1712817 from trunk)

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1712842 from Christine Poerschke in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1712842 ] SOLR-8223 : Avoid accidentally swallowing OutOfMemoryError (in LeaderInitiatedRecoveryThread.java or CoreContainer.java) (merge in revision 1712817 from trunk)
        Hide
        cpoerschke Christine Poerschke added a comment -

        Thanks Mike!

        Show
        cpoerschke Christine Poerschke added a comment - Thanks Mike!

          People

          • Assignee:
            cpoerschke Christine Poerschke
            Reporter:
            mdrob Mike Drob
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development