SOLR-3041

Solr instances using the SolrCloud shared config in ZK might not all start successfully when started simultaneously for the first time

    Details

      Description

      Starting Solr like this
      java -DzkHost=<ZKs> -Dbootstrap_confdir=./myproject/conf -Dcollection.configName=myproject_conf -Dsolr.solr.home=./myproject -jar start.jar

      If the config is not already in ZK (that is, when Solr is started for the first time), Solr copies the content of ./myproject/conf into ZK. That process is not safe to run in parallel, so if the content is not there and I start several Solrs simultaneously, one or more of them might not start successfully.

      I see exceptions like the ones shown below, and the Solrs throwing them will not work correctly afterwards.

      I know there are workarounds, such as always starting one Solr and waiting a while before starting the rest, but I think we should really be more robust in these cases.

      Regards, Per Steffensen
      ---- exception example 1 (the znode causing the problem can be different than /configs/myproject_conf/protwords.txt) ----
      org.apache.solr.common.cloud.ZooKeeperException:
      at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:193)
      at org.apache.solr.core.CoreContainer.load(CoreContainer.java:337)
      at org.apache.solr.core.CoreContainer.load(CoreContainer.java:294)
      at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:240)
      at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:93)
      at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
      at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
      at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
      at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
      at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
      at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
      at org.mortbay.jetty.Server.doStart(Server.java:224)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.mortbay.start.Main.invokeMain(Main.java:194)
      at org.mortbay.start.Main.start(Main.java:534)
      at org.mortbay.start.Main.start(Main.java:441)
      at org.mortbay.start.Main.main(Main.java:119)
      Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /configs/myproject_conf/protwords.txt
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
      at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
      at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:347)
      at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:308)
      at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:290)
      at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:255)
      at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:384)
      at org.apache.solr.common.cloud.SolrZkClient.setData(SolrZkClient.java:410)
      at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:520)
      at org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:536)
      at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:175)
      ... 29 more
      ---- exception example 2 ----
      org.apache.solr.common.cloud.ZooKeeperException:
      at org.apache.solr.core.CoreContainer.register(CoreContainer.java:526)
      at org.apache.solr.core.CoreContainer.load(CoreContainer.java:410)
      at org.apache.solr.core.CoreContainer.load(CoreContainer.java:294)
      at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:240)
      at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:93)
      at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
      at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
      at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
      at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
      at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
      at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
      at org.mortbay.jetty.Server.doStart(Server.java:224)
      at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
      at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.mortbay.start.Main.invokeMain(Main.java:194)
      at org.mortbay.start.Main.start(Main.java:534)
      at org.mortbay.start.Main.start(Main.java:441)
      at org.mortbay.start.Main.main(Main.java:119)
      Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/myproject_collection/shards/shard1/192.168.98.1:8983_myproject_collection_shard1
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
      at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
      at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
      at org.apache.solr.common.cloud.SolrZkClient.create(SolrZkClient.java:238)
      at org.apache.solr.cloud.ZkController.register(ZkController.java:483)
      at org.apache.solr.core.CoreContainer.register(CoreContainer.java:517)
      ... 29 more

        Activity

        Hoss Man added a comment -

        Could you please assess & triage this for 4.0?

        Hoss Man added a comment -

        Bulk fixing the version info for 4.0-ALPHA and 4.0; all affected issues have "hoss20120711-bulk-40-change" in a comment.

        Robert Muir added a comment -

        rmuir20120906-bulk-40-change

        Robert Muir added a comment -

        moving all 4.0 issues not touched in a month to 4.1

        Mark Miller added a comment -

        The overall issue is pretty much solved by the new zkCli tool I think - you can just use that to upload your config before starting up.

        In terms of being more robust when not using that tool, I guess that is still something to consider here.
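
        One possible reading of "more robust" is to treat an already-existing znode as success during the bootstrap upload, since the first stack trace in the description ends in a NodeExistsException raised from a create call. The following is only a minimal sketch under stated assumptions: it uses an in-memory stand-in rather than a real ZooKeeper client, and FakeZk, ensureNode, simulate and the nested NodeExistsException are hypothetical names for illustration, not Solr or ZooKeeper APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: FakeZk stands in for a ZooKeeper client.
class BootstrapUploadSketch {

    /** Hypothetical stand-in for a "node already exists" error. */
    static class NodeExistsException extends Exception {
        NodeExistsException(String path) { super("NodeExists for " + path); }
    }

    /** In-memory stand-in for a znode store with an atomic create. */
    static class FakeZk {
        private final ConcurrentMap<String, byte[]> nodes = new ConcurrentHashMap<>();

        void create(String path, byte[] data) throws NodeExistsException {
            // putIfAbsent is atomic, so exactly one concurrent creator wins.
            if (nodes.putIfAbsent(path, data) != null) {
                throw new NodeExistsException(path);
            }
        }

        int size() { return nodes.size(); }
    }

    /** Creates the node; "already exists" counts as success. Returns true if this caller created it. */
    static boolean ensureNode(FakeZk zk, String path, byte[] data) {
        try {
            zk.create(path, data);
            return true;
        } catch (NodeExistsException e) {
            return false; // another starter uploaded it first; nothing to do
        }
    }

    /** Runs several simulated starters at once, each uploading the same paths; returns total nodes created. */
    static int simulate(FakeZk zk, List<String> paths, int starters) {
        CountDownLatch go = new CountDownLatch(1);
        AtomicInteger created = new AtomicInteger();
        Thread[] threads = new Thread[starters];
        for (int i = 0; i < starters; i++) {
            threads[i] = new Thread(() -> {
                try {
                    go.await(); // hold all starters until released together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (String p : paths) {
                    if (ensureNode(zk, p, "x".getBytes(StandardCharsets.UTF_8))) {
                        created.incrementAndGet();
                    }
                }
            });
            threads[i].start();
        }
        go.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for starters", e);
            }
        }
        return created.get();
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        List<String> paths = Arrays.asList(
                "/configs/myproject_conf/protwords.txt",
                "/configs/myproject_conf/schema.xml");
        int created = simulate(zk, paths, 8);
        System.out.println("created=" + created + ", nodes=" + zk.size());
    }
}
```

        In this sketch every simulated starter finishes without an error and each path is created exactly once, whichever thread gets there first. Whether the same approach fits the real upload code would depend on details not shown in this issue, for example whether the data written by the different starters is guaranteed to be identical.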


          People

          • Assignee:
            Unassigned
            Reporter:
            Per Steffensen
          • Votes:
            0
            Watchers:
            4

              Time Tracking

              Estimated:
              96h
              Remaining:
              96h
              Logged:
              Not Specified