SOLR-6347

'deletereplica' can throw a NullPointerException

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.9
    • Fix Version/s: 4.10, 5.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      Occasionally, but not always, when I invoke DELETEREPLICA I get an NPE. I suspect a race condition in which the core finishes deleting while the overseer is still checking for it.

      Client response:
      curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=mycollection&shard=tmp_shard&replica=core_node1"
      <?xml version="1.0" encoding="UTF-8"?>
      <response>
      <lst name="responseHeader"><int name="status">500</int><int name="QTime">3712</int></lst>
      <lst name="success"><lst><lst name="responseHeader"><int name="status">0</int><int name="QTime">27</int></lst></lst></lst>
      <str name="Operation deletereplica caused exception:">java.lang.NullPointerException:java.lang.NullPointerException</str>
      <lst name="exception"><null name="msg"/><int name="rspCode">-1</int></lst>
      <lst name="error"><str name="trace">org.apache.solr.common.SolrException
      at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:364)
      at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:320)
      at org.apache.solr.handler.admin.CollectionsHandler.handleRemoveReplica(CollectionsHandler.java:494)
      at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:184)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
      at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:267)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
      at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
      at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
      at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
      at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
      at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
      at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
      at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
      at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
      at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
      at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
      at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
      at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
      at org.eclipse.jetty.server.Server.handle(Server.java:368)
      at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
      at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
      at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
      at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
      at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
      at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
      at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
      at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
      at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
      at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
      at java.lang.Thread.run(Thread.java:744)
      </str><int name="code">500</int></lst>
      </response>
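
      The request above can also be assembled programmatically. The following is a minimal sketch that only builds the Collections API URL shown in the curl command; the class name `DeleteReplicaUrl` and the method `build` are hypothetical helpers for illustration, not part of Solr or SolrJ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the DELETEREPLICA request URL from its parts.
public class DeleteReplicaUrl {

    public static String build(String baseUrl, String collection, String shard, String replica) {
        // LinkedHashMap keeps the parameters in the same order as the curl example.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "DELETEREPLICA");
        params.put("collection", collection);
        params.put("shard", shard);
        params.put("replica", replica);

        StringBuilder url = new StringBuilder(baseUrl).append("/admin/collections?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return url.toString();
    }

    // URL-encode a single query component as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr", "mycollection", "tmp_shard", "core_node1"));
    }
}
```

      Encoding each parameter separately keeps the URL valid if a collection or shard name ever contains characters that are not URL-safe.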

      Server log:
      21:06:05.368 [OverseerThreadFactory-6-thread-5] WARN o.a.s.c.OverseerCollectionProcessor - OverseerCollectionProcessor.processMessage : deletereplica ,

      { "operation":"deletereplica", "collection":"mycollection", "shard":"tmp_shard", "replica":"core_node1"}

      21:06:05.602 [OverseerThreadFactory-6-thread-5] ERROR o.a.s.c.OverseerCollectionProcessor - Collection deletereplica of deletereplica failed:java.lang.NullPointerException
      at org.apache.solr.cloud.OverseerCollectionProcessor.waitForCoreNodeGone(OverseerCollectionProcessor.java:911)
      at org.apache.solr.cloud.OverseerCollectionProcessor.deleteReplica(OverseerCollectionProcessor.java:899)
      at org.apache.solr.cloud.OverseerCollectionProcessor.processMessage(OverseerCollectionProcessor.java:573)
      at org.apache.solr.cloud.OverseerCollectionProcessor$Runner.run(OverseerCollectionProcessor.java:2619)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:744)

      1. SOLR-6347.patch
        7 kB
        Anshum Gupta
      2. SOLR-6347.patch
        1 kB
        Anshum Gupta

        Activity

        Anshum Gupta added a comment -

        The NPE seems to be thrown when the last replica of an implicit-routed shard (collection) is deleted.
        Added some checks there to avoid the NPE. Will add some tests and commit.
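
        The kind of check described above can be illustrated with a minimal sketch. This is not the committed patch: the class `ReplicaGoneCheck`, its methods, and the nested-map model of cluster state (shard name to replica name to node name) are assumptions made for illustration. It assumes the shard entry itself can be absent once its last replica is deleted, so a chained lookup dereferences null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not Solr code: cluster state as shard -> (replica -> node).
public class ReplicaGoneCheck {

    // Chained lookup: throws NullPointerException when the shard entry is missing.
    public static boolean isReplicaGoneUnsafe(Map<String, Map<String, String>> shards,
                                              String shard, String replica) {
        return shards.get(shard).get(replica) == null;
    }

    // Null-safe lookup: a missing shard means the replica is gone as well.
    public static boolean isReplicaGone(Map<String, Map<String, String>> shards,
                                        String shard, String replica) {
        Map<String, String> replicas = shards.get(shard);
        return replicas == null || !replicas.containsKey(replica);
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> shards = new HashMap<>();
        Map<String, String> replicas = new HashMap<>();
        replicas.put("core_node1", "localhost:8983_solr");
        shards.put("tmp_shard", replicas);

        // Replica still registered.
        System.out.println(isReplicaGone(shards, "tmp_shard", "core_node1"));

        // Simulate the shard disappearing together with its last replica.
        shards.remove("tmp_shard");
        System.out.println(isReplicaGone(shards, "tmp_shard", "core_node1"));
    }
}
```

        Treating a missing shard as "replica gone" lets a polling loop terminate normally instead of failing partway through the check.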

        Anshum Gupta added a comment -

        Fix and tests

        ASF subversion and git services added a comment -

        Commit 1617673 from Anshum Gupta in branch 'dev/trunk'
        [ https://svn.apache.org/r1617673 ]

        SOLR-6347: Fix NPE during last replica deletion for custom sharded collections using DELETEREPLICA

        ASF subversion and git services added a comment -

        Commit 1617678 from Anshum Gupta in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1617678 ]

        SOLR-6347: Fix NPE during last replica deletion for custom sharded collections using DELETEREPLICA (Merge from trunk r1617673)

        ASF subversion and git services added a comment -

        Commit 1617972 from Anshum Gupta in branch 'dev/trunk'
        [ https://svn.apache.org/r1617972 ]

        SOLR-6347: Fixing CHANGES.txt entry

        ASF subversion and git services added a comment -

        Commit 1617973 from Anshum Gupta in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1617973 ]

        SOLR-6347: Fixing CHANGES.txt entry (Merging from trunk)

        Shalin Shekhar Mangar added a comment -

        The test added here has been failing a lot on Jenkins. Something is wrong here. Also, there's no fix version for this bug?

        ASF subversion and git services added a comment -

        Commit 1627122 from Noble Paul in branch 'dev/trunk'
        [ https://svn.apache.org/r1627122 ]

        related to the failures seen here SOLR-6347

        ASF subversion and git services added a comment -

        Commit 1627214 from Noble Paul in branch 'dev/trunk'
        [ https://svn.apache.org/r1627214 ]

        disabling the test till it is resolved SOLR-6347

        ASF subversion and git services added a comment -

        Commit 1627215 from Noble Paul in branch 'dev/trunk'
        [ https://svn.apache.org/r1627215 ]

        disabling the test till it is resolved SOLR-6347

        Anshum Gupta added a comment - edited

        Shalin Shekhar Mangar, I think the test didn't fail until the 4x -> 5x and trunk changes happened (or something else that was committed at around the same time). Something changed that made this and DeleteReplicaTest fail consistently. I'll try to have a look at it. Also, this is in the CHANGES list for 4.10; should we update that here?

        Also, I think it'd be good to create another issue to handle the Delete*ReplicaTest failures.

        Noble Paul added a comment -

        Makes sense, Anshum Gupta.

        Anshum Gupta added a comment -

        Created SOLR-6593 for the failing Delete*ReplicaTest issue.

        Anshum Gupta added a comment -

        Bulk close after 5.0 release.


          People

          • Assignee: Anshum Gupta
          • Reporter: Ralph Tice
          • Votes: 0
          • Watchers: 5
