Solr
  1. Solr
  2. SOLR-8223

Take care not to accidentally swallow OOMErrors

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.10.3
    • Fix Version/s: 5.4, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      This was first noticed with 4.10.3, but it looks like it still applies to trunk. There are a few places in the code where we catch Throwable and then don't check for OOM or rethrow it. This behaviour means that OOM kill scripts won't run, and the JVM can get into an undesirable state.

        Activity

        Hide
        Mike Drob added a comment -

        Attaching a patch that fixes an instance in CoreContainer and LIRThread.

        Show
        Mike Drob added a comment - Attaching a patch that fixes an instance in CoreContainer and LIRThread .
        Hide
        Christine Poerschke added a comment -

        Small change, looks reasonable to me.

        re: the CoreContainer change being to RuntimeException whereas the LeaderInitiatedRecoveryThread change is to Exception - just curious re: why the difference? The class hierarchy being Throwable -> Error -> VirtualMachineError -> OutOfMemoryError and Throwable -> Exception -> RuntimeException both changes will stop swallowing the OOMError but the CoreContainer will also stop catching non-RuntimeException Exceptions, so just curious re: the distinction.

        Show
        Christine Poerschke added a comment - Small change, looks reasonable to me. re: the CoreContainer change being to RuntimeException whereas the LeaderInitiatedRecoveryThread change is to Exception - just curious re: why the difference? The class hierarchy being Throwable -> Error -> VirtualMachineError -> OutOfMemoryError and Throwable -> Exception -> RuntimeException both changes will stop swallowing the OOMError but the CoreContainer will also stop catching non-RuntimeException Exceptions, so just curious re: the distinction.
        Hide
        Mike Drob added a comment -

        In CoreContainer, zkSys.registerInZk doesn't declare any checked exceptions, so we can narrow the scope to catching only unchecked ones. This might help us notice API or compatibility changes in the future.

        In LeaderInitiatedRecoveryThread, client.request declares both SolrServerException and IOException so we still have to handle them in some way. The existing code did not make a distinction between those and RuntimeException, so I made the minimal changes I could. It would have been reasonable to catch (IOException | SolrServerException | RuntimeException e) but I didn't think of that at the time and I'm not sure it really adds much value.

        Show
        Mike Drob added a comment - In CoreContainer , zkSys.registerInZk doesn't declare any checked exceptions, so we can narrow the scope to catching only unchecked ones. This might help us notice API or compatibility changes in the future. In LeaderInitiatedRecoveryThread , client.request declares both SolrServerException and IOException so we still have to handle them in some way. The existing code did not make a distinction between those and RuntimeException , so I made the minimal changes I could. It would have been reasonable to catch (IOException | SolrServerException | RuntimeException e) but I didn't think of that at the time and I'm not sure it really adds much value.
        Hide
        Christine Poerschke added a comment -

        Makes sense. Thanks for explaining.

        Show
        Christine Poerschke added a comment - Makes sense. Thanks for explaining.
        Hide
        Mike Drob added a comment -

        Christine Poerschke - do you think this is good to commit, or are there other changes you think it should have? I tried to figure out what kind of meaningful tests I could add, but couldn't come up with anything that was non-trivial and useful.

        Show
        Mike Drob added a comment - Christine Poerschke - do you think this is good to commit, or are there other changes you think it should have? I tried to figure out what kind of meaningful tests I could add, but couldn't come up with anything that was non-trivial and useful.
        Hide
        Christine Poerschke added a comment -

        I think this is good to commit. (Won't get to it today though.)

        Show
        Christine Poerschke added a comment - I think this is good to commit. (Won't get to it today though.)
        Hide
        ASF subversion and git services added a comment -

        Commit 1712817 from Christine Poerschke in branch 'dev/trunk'
        [ https://svn.apache.org/r1712817 ]

        SOLR-8223: Avoid accidentally swallowing OutOfMemoryError (in LeaderInitiatedRecoveryThread.java or CoreContainer.java)

        Show
        ASF subversion and git services added a comment - Commit 1712817 from Christine Poerschke in branch 'dev/trunk' [ https://svn.apache.org/r1712817 ] SOLR-8223 : Avoid accidentally swallowing OutOfMemoryError (in LeaderInitiatedRecoveryThread.java or CoreContainer.java)
        Hide
        ASF subversion and git services added a comment -

        Commit 1712842 from Christine Poerschke in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1712842 ]

        SOLR-8223: Avoid accidentally swallowing OutOfMemoryError (in LeaderInitiatedRecoveryThread.java or CoreContainer.java) (merge in revision 1712817 from trunk)

        Show
        ASF subversion and git services added a comment - Commit 1712842 from Christine Poerschke in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1712842 ] SOLR-8223 : Avoid accidentally swallowing OutOfMemoryError (in LeaderInitiatedRecoveryThread.java or CoreContainer.java) (merge in revision 1712817 from trunk)
        Hide
        Christine Poerschke added a comment -

        Thanks Mike!

        Show
        Christine Poerschke added a comment - Thanks Mike!

          People

          • Assignee:
            Christine Poerschke
            Reporter:
            Mike Drob
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development