SOLR-7038

If no configset exists, CREATE leads to a 500 error with never-ending logging and 100% CPU usage

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 5.0
    • Fix Version/s: 4.10.4, 5.0, 6.0
    • Component/s: None
    • Labels: None

      Description

      Here's what I did:

      > bin/solr start -e cloud -noprompt
      
      > curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=thisshouldfail&numShards=1&configset=thisisaninvalidconfigset&wt=json"
      

      The above created a new collection named thisshouldfail that uses the gettingstarted config set. This call should have failed because no config set with that name exists. Instead, it picked up the only config set it found and used it.

      There's more to this. I'm not sure how related the following is, but it looks related to me.

      > bin/solr start -c
      
      > curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=thisshouldfail&numShards=1&configset=thisisaninvalidconfigset&wt=json"
      

      This led to a 900M (and growing) log file in addition to 100% CPU until I killed Solr.

      1. SOLR-7038.patch
        4 kB
        Mark Miller
      2. SOLR-7038.patch
        3 kB
        Anshum Gupta

        Activity

        sarowe@syr.edu Use account "steve_rowe" instead added a comment -

        I can repro both conditions on trunk from the source after running ant server.

        anshumg Anshum Gupta added a comment -

        Rookie error: while trying to create the collection, for some reason I passed the config set name as 'configset=' instead of 'collection.configName='.

        The first use case is pretty much invalid, but anything that takes up 100% CPU and grows the logs to a few GB in no time is bad, so I'm looking at that now.
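
For reference, a minimal sketch of how the request could be built with the `collection.configName` parameter named in the comment above. The host, port, and names are taken from the description; the helper `create_url` is hypothetical and only assembles the URL. It does not encode special characters, so it assumes plain alphanumeric names.

```shell
#!/bin/sh
# Build a Collections API CREATE URL (illustrative helper, not part of Solr).
# $1 = collection name, $2 = number of shards, $3 = config set name
create_url() {
  printf '%s?action=CREATE&name=%s&numShards=%s&collection.configName=%s&wt=json' \
    'http://localhost:8983/solr/admin/collections' "$1" "$2" "$3"
}

url=$(create_url thisshouldfail 1 thisisaninvalidconfigset)
echo "$url"

# To send it (requires a running SolrCloud node), quote the URL so the
# shell does not treat each '&' as a command separator:
#   curl "$url"
```

Keeping the URL in a quoted variable avoids the shell splitting the command at each `&`.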

        anshumg Anshum Gupta added a comment -

        CREATE call without any configset in SolrCloud:
        http://localhost:8983/solr/admin/collections?action=CREATE&name=thisshouldfail&numShards=1&collection.configName=thisisaninvalidconfigset

        Here's the exception that gets logged over and over again:

        INFO  - 2015-01-26 21:54:16.536; org.apache.solr.cloud.overseer.ClusterStateMutator; building a new cName: c1
        INFO  - 2015-01-26 21:54:16.536; org.apache.solr.cloud.overseer.ZkStateWriter; going to create_collection /collections/c1/state.json
        ERROR - 2015-01-26 21:54:16.537; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Exception in Overseer main queue loop
        org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/c1/state.json
        	at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
        	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
        	at org.apache.solr.common.cloud.SolrZkClient$9.execute(SolrZkClient.java:379)
        	at org.apache.solr.common.cloud.SolrZkClient$9.execute(SolrZkClient.java:376)
        	at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:61)
        	at org.apache.solr.common.cloud.SolrZkClient.create(SolrZkClient.java:376)
        	at org.apache.solr.cloud.overseer.ZkStateWriter.writePendingUpdates(ZkStateWriter.java:163)
        	at org.apache.solr.cloud.overseer.ZkStateWriter.enqueueUpdate(ZkStateWriter.java:91)
        	at org.apache.solr.cloud.Overseer$ClusterStateUpdater.processQueueItem(Overseer.java:337)
        	at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:247)
        	at java.lang.Thread.run(Thread.java:745)
        

        and the trace from the response:

        "trace": "org.apache.solr.common.SolrException\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:737)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:693)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleCreateAction(CollectionsHandler.java:870)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:188)\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)\n\tat org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:736)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:261)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:204)\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:497)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:313)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)\n\tat org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:626)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:546)\n\tat java.lang.Thread.run(Thread.java:745)\n",
        "code": 500
        
        anshumg Anshum Gupta added a comment -

        I've manually tested this patch using the scripts. Currently running the tests.

        This validates the config set (presence of one) and returns an exception if there's no config with the specified/derived name.
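
The idea of checking for the config set before attempting the create can be illustrated with a small sketch. This is not the patch itself: `KNOWN_CONFIGS`, `config_exists`, and `create_collection` are hypothetical stand-ins, and the list of names is hard-coded here rather than read from ZooKeeper.

```shell
#!/bin/sh
# Hypothetical stand-in for the config sets that are actually available.
# "gettingstarted" comes from the description; "myconf" is made up.
KNOWN_CONFIGS="gettingstarted
myconf"

# Succeeds only if $1 matches one known config name exactly.
config_exists() {
  # -x: whole-line match, -F: fixed string, -q: no output
  printf '%s\n' "$KNOWN_CONFIGS" | grep -qxF -- "$1"
}

# $1 = collection name, $2 = config set name
create_collection() {
  if ! config_exists "$2"; then
    # Fail fast with a clear error instead of starting the create.
    echo "error: config set '$2' not found; not creating '$1'" >&2
    return 1
  fi
  echo "would create collection '$1' with config set '$2'"
}

create_collection thisshouldfail thisisaninvalidconfigset || echo "create rejected"
```

Rejecting the request up front means no create operation is attempted for a config set that does not exist.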

        anshumg Anshum Gupta added a comment -

        I have the friendly (thanks to EZMock) OCPTest failing because of this one. Any help on this would be great. Mark Miller?

        markrmiller@gmail.com Mark Miller added a comment -

        Patch that cleans up some formatting and fixes the OCP test so that it passes.

        anshumg Anshum Gupta added a comment -

        Thanks Mark.

        jira-bot ASF subversion and git services added a comment -

        Commit 1654942 from Anshum Gupta in branch 'dev/trunk'
        [ https://svn.apache.org/r1654942 ]

        SOLR-7038: Validate config set presence before trying to create a collection

        jira-bot ASF subversion and git services added a comment -

        Commit 1654943 from Anshum Gupta in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1654943 ]

        SOLR-7038: Validate config set presence before trying to create a collection (merge from trunk)

        jira-bot ASF subversion and git services added a comment -

        Commit 1654946 from Anshum Gupta in branch 'dev/branches/lucene_solr_5_0'
        [ https://svn.apache.org/r1654946 ]

        SOLR-7038: Validate config set presence before trying to create a collection (merge from branch_5x)

        anshumg Anshum Gupta added a comment -

        Bulk close after 5.0 release.

        anshumg Anshum Gupta added a comment -

        Reopening for backporting to 4.10.4.

        jira-bot ASF subversion and git services added a comment -

        Commit 1662589 from Anshum Gupta in branch 'dev/branches/lucene_solr_4_10'
        [ https://svn.apache.org/r1662589 ]

        SOLR-7038: Validate config set presence before trying to create a collection (merge from branch_5x)

        mikemccand Michael McCandless added a comment -

        Bulk close for 4.10.4 release


          People

          • Assignee:
            anshumg Anshum Gupta
            Reporter:
            anshumg Anshum Gupta