SOLR-3273

404 Not Found on action=PREPRECOVERY

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Invalid
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: None
    • Component/s: SolrCloud
    • Labels:
      None
    • Environment:

      Any

      Description

      We have an application based on a recent copy of 4.0-SNAPSHOT. We have a performance test setup in which we performance test our application (and therefore, indirectly, Solr(Cloud)). When we run the performance test against a SolrCloud setup without replication, everything seems to run very nicely for days. When we add replication to the setup, the same performance test shows some problems, which we will report (and maybe help fix) in separate issues here in JIRA.

      About the setup - the setup is a little more complex than described below, but I believe the description will tell "enough":
      We have two Solr servers, which we start from <solr-install>/example using the command below (the ZooKeepers have been started beforehand). We first start Solr on server1, and then start Solr on server2 after Solr on server1 has finished starting up:

      nohup java -Xmx4096m -Dcom.sun.management.jmxremote -DzkHost=server1:2181,server2:2181,server3:2181 -Dbootstrap_confdir=./myapp/conf -Dcollection.configName=myapp_conf -Dsolr.solr.home=./myapp -Djava.util.logging.config.file=logging.properties -jar start.jar >./myapp/logs/stdout.log 2>./myapp/logs/stderr.log &
      

      The ./myapp/solr.xml looks like this on server1:

      <?xml version="1.0" encoding="UTF-8" ?>
      <solr persistent="false">
        <cores adminPath="/admin/myapp" host="server1" hostPort="8983" hostContext="solr">
          <core name="collA_slice1_shard1" instanceDir="." dataDir="collA_slice1_data" collection="collA" shard="slice1" />
        </cores>
      </solr>
      

      The ./myapp/solr.xml looks like this on server2:

      <?xml version="1.0" encoding="UTF-8" ?>
      <solr persistent="false">
        <cores adminPath="/admin/myapp" host="server2" hostPort="8983" hostContext="solr">
          <core name="collA_slice1_shard2" instanceDir="." dataDir="collA_slice1_data" collection="collA" shard="slice1" />
        </cores>
      </solr>
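
Both files declare a single core, and both cores carry collection="collA" and shard="slice1", so the core on server2 would be expected to join as a replica of the core on server1. A minimal Python sketch (standard library only) that reads the two files as posted and checks this; the helper name `describe_cores` and the template variable are illustrative and not part of Solr:

```python
import xml.etree.ElementTree as ET

# solr.xml as posted in the report; only the host and core name differ per server.
SOLR_XML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/myapp" host="{host}" hostPort="8983" hostContext="solr">
    <core name="{core}" instanceDir="." dataDir="collA_slice1_data" collection="collA" shard="slice1" />
  </cores>
</solr>
"""


def describe_cores(solr_xml):
    """Return one dict per <core>, merged with the enclosing <cores> attributes."""
    root = ET.fromstring(solr_xml)
    cores = root.find("cores")
    return [
        {
            "host": cores.get("host"),
            "adminPath": cores.get("adminPath"),
            "hostContext": cores.get("hostContext"),
            "core": core.get("name"),
            "collection": core.get("collection"),
            "shard": core.get("shard"),
        }
        for core in cores.findall("core")
    ]


server1 = describe_cores(SOLR_XML_TEMPLATE.format(host="server1", core="collA_slice1_shard1"))[0]
server2 = describe_cores(SOLR_XML_TEMPLATE.format(host="server2", core="collA_slice1_shard2"))[0]

# Same collection and same shard on two hosts: two copies of one shard.
same_shard = (server1["collection"], server1["shard"]) == (server2["collection"], server2["shard"])
print(server1["core"], server2["core"], "same shard:", same_shard)
```

Note that both files also set adminPath="/admin/myapp" rather than leaving it at a default.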
      

      The first thing we observe is that Solr server1 (running collA_slice1_shard1) seems to start up nicely, but when Solr server2 (running collA_slice1_shard2) is started later, it quickly reports the following in its solr.log and keeps doing so for a long time:

      SEVERE: Error while trying to recover:org.apache.solr.common.SolrException: Not Found
      
      request: http://server1:8983/solr/admin/cores?action=PREPRECOVERY&core=collA_slice1_shard1&nodeName=server2%3A8983_solr&coreNodeName=server2%3A8983_solr_collA_slice1_shard2&state=recovering&checkLive=true&pauseFor=6000&wt=javabin&version=2
              at org.apache.solr.common.SolrExceptionPropagationHelper.decodeFromMsg(SolrExceptionPropagationHelper.java:40)
              at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:445)
              at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:264)
              at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:188)
              at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:285)
              at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:206)
      

      Please note that we have slightly changed the way errors are logged, but basically this means that Solr server2 gets a "404 Not Found" on its request "http://server1:8983/solr/admin/cores?action=PREPRECOVERY&core=collA_slice1_shard1&nodeName=server2%3A8983_solr&coreNodeName=server2%3A8983_solr_collA_slice1_shard2&state=recovering&checkLive=true&pauseFor=6000&wt=javabin&version=2" to Solr server1.
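
The request is easier to read once its query string is decoded. A small Python sketch, using only the standard library, that splits the logged URL into target host, path, and parameters; the variable names are illustrative:

```python
from urllib.parse import urlsplit, parse_qs

# The request URL exactly as it appears in the log above.
REQUEST_URL = (
    "http://server1:8983/solr/admin/cores?action=PREPRECOVERY"
    "&core=collA_slice1_shard1&nodeName=server2%3A8983_solr"
    "&coreNodeName=server2%3A8983_solr_collA_slice1_shard2"
    "&state=recovering&checkLive=true&pauseFor=6000&wt=javabin&version=2"
)

parts = urlsplit(REQUEST_URL)
# parse_qs returns a list per key and percent-decodes values (%3A becomes ':').
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

print("target:", parts.hostname, parts.port, parts.path)
for key, value in params.items():
    print(f"  {key} = {value}")
```

Decoded, nodeName identifies the sender (server2) and core names the core on server1 that the request is addressed to.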

      It seems there is no common agreement among the Solr servers on how/where to send those requests and how/where to listen for them.
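
One way to check the "where to send" versus "where to listen" question is to compare the path in the failing request with the values in the posted solr.xml. The sketch below does only that comparison. It assumes the core admin handler is served at hostContext followed by adminPath, which is an assumption of this sketch and not something the issue states; `expected_admin_path` is a hypothetical helper. The issue does not record why it was resolved as Invalid, so this is a comparison of reported values and not a confirmed diagnosis.

```python
from urllib.parse import urlsplit


def expected_admin_path(host_context, admin_path):
    """Join hostContext and adminPath into one URL path (assumed layout, see lead-in)."""
    return "/" + host_context.strip("/") + "/" + admin_path.strip("/")


# Values taken from the <cores> element of the posted solr.xml.
configured = expected_admin_path("solr", "/admin/myapp")

# Path that server2 actually requested on server1, taken from the log.
requested = urlsplit("http://server1:8983/solr/admin/cores?action=PREPRECOVERY").path

paths_match = configured == requested
print("configured:", configured)
print("requested: ", requested)
print("match:     ", paths_match)
```

Under that assumption the configured and requested paths differ, which would be consistent with a 404 from server1.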

      Regards, Per Steffensen

        Activity

        Mark Miller made changes -
        Status: Open [ 1 ] -> Resolved [ 5 ]
        Resolution: Invalid [ 6 ]
        Mark Miller made changes -
        Priority: Major [ 3 ] -> Minor [ 4 ]
        Mark Miller made changes -
        Assignee: Mark Miller [ markrmiller@gmail.com ]
        Per Steffensen made changes -
        Description: code-block markup changed from <pre> to {code}; the wording is the same as in the Description above
        Per Steffensen created issue -

          People

          • Assignee:
            Mark Miller
            Reporter:
            Per Steffensen
          • Votes:
            0
            Watchers:
            0
