Solr / SOLR-2691

solr.xml persistence is broken for multicore (by SOLR-2331)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.0-ALPHA
    • Component/s: multicore
    • Labels:
      None

      Description

      With the trunk build, running SolrCloud, if I issue a PERSIST CoreAdmin command,
      the solr.xml gets overwritten with only the last core, repeated as many times
      as there are cores.

      It used to work fine with a trunk build from a couple of months ago, so it looks like
      something broke solr.xml persistence.

      It appears to have been introduced by SOLR-2331:
      CoreContainer#persistFile creates the map for core attributes (coreAttribs) outside
      of the loop that iterates over cores. Therefore, all cores reuse the same map of attributes
      and hence only the values from the last core are preserved and used for all cores in the list.
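The shared-map problem described above can be illustrated with a minimal, self-contained sketch. This is not the actual CoreContainer#persistFile code: the class name `SharedMapDemo` and both method names are hypothetical, and the attribute set is reduced to `name` and `instanceDir` for brevity. It only assumes that each core's attributes are collected into a map and that the maps are kept for later serialization.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the real CoreContainer code.
public class SharedMapDemo {

    // Buggy pattern: the map is created once, outside the loop,
    // so every list entry refers to the same object.
    static List<Map<String, String>> collectBuggy(List<String> coreNames) {
        List<Map<String, String>> result = new ArrayList<>();
        Map<String, String> coreAttribs = new LinkedHashMap<>();
        for (String name : coreNames) {
            coreAttribs.put("name", name);
            coreAttribs.put("instanceDir", name + "/");
            result.add(coreAttribs); // same map instance added each time
        }
        return result;
    }

    // Fixed pattern: a fresh map is created for each core inside the loop.
    static List<Map<String, String>> collectFixed(List<String> coreNames) {
        List<Map<String, String>> result = new ArrayList<>();
        for (String name : coreNames) {
            Map<String, String> coreAttribs = new LinkedHashMap<>();
            coreAttribs.put("name", name);
            coreAttribs.put("instanceDir", name + "/");
            result.add(coreAttribs);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> cores = Arrays.asList("master1", "master2", "slave1", "slave2");
        System.out.println("buggy: " + collectBuggy(cores));
        System.out.println("fixed: " + collectFixed(cores));
    }
}
```

In the buggy variant all four entries end up describing slave2, because each `put` overwrites the values written for the previous core. The fixed variant keeps one distinct map per core.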

      I'm running SolrCloud, using:
      -Dbootstrap_confdir=/opt/solr/solr/conf -Dcollection.configName=hcpconf -DzkRun

      I'm starting Solr with four cores listed in solr.xml:

      <solr persistent="true">
      <cores adminPath="/admin/cores" defaultCoreName="master1">
      <core name="master1" instanceDir="master1" shard="shard1" collection="hcpconf" />
      <core name="master2" instanceDir="master2" shard="shard2" collection="hcpconf" />
      <core name="slave1" instanceDir="slave1" shard="shard1" collection="hcpconf" />
      <core name="slave2" instanceDir="slave2" shard="shard2" collection="hcpconf" />
      </cores>
      </solr>

      I then issue a PERSIST request:
      http://localhost:8983/solr/admin/cores?action=PERSIST

      And the solr.xml turns into:

      <solr persistent="true">
      <cores defaultCoreName="master1" adminPath="/admin/cores" zkClientTimeout="10000" hostPort="8983" hostContext="solr">
      <core shard="shard2" instanceDir="slave2/" name="slave2" collection="hcpconf"/>
      <core shard="shard2" instanceDir="slave2/" name="slave2" collection="hcpconf"/>
      <core shard="shard2" instanceDir="slave2/" name="slave2" collection="hcpconf"/>
      <core shard="shard2" instanceDir="slave2/" name="slave2" collection="hcpconf"/>
      </cores>
      </solr>
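One way to detect this symptom automatically is to parse the persisted file and check that no core name appears twice. The sketch below uses only the JDK's built-in XML parser. The class `PersistedCoresCheck` and its methods are hypothetical helpers written for this illustration and are not part of Solr; the sketch also assumes the file is small enough to pass around as a string.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of Solr.
public class PersistedCoresCheck {

    // Returns the "name" attribute of every <core> element, in document order.
    static List<String> coreNames(String solrXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(solrXml.getBytes(StandardCharsets.UTF_8)));
            NodeList cores = doc.getElementsByTagName("core");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < cores.getLength(); i++) {
                names.add(((Element) cores.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse solr.xml content", e);
        }
    }

    // True if at least one core name is listed more than once.
    static boolean hasDuplicateCores(String solrXml) {
        List<String> names = coreNames(solrXml);
        return new HashSet<>(names).size() != names.size();
    }
}
```

Applied to the two listings above, `hasDuplicateCores` would return false for the original solr.xml and true for the file written by PERSIST.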

      1. SOLR-2691.patch
        7 kB
        Hoss Man
      2. SOLR-2691.patch
        0.7 kB
        Hoss Man
      3. jira2691.patch
        0.7 kB
        Yury Kats

          Activity

          Mark Miller added a comment -

          You don't want to close y because it's been registered and the CoreContainer will close it. You do want to close X because it has been removed from the CoreContainer.

          Mark Miller added a comment -

          weird...that should really be a test fail...we should probably track cores as we do searchers...

          Hoss Man added a comment -

          I'm still seeing this test complain that close() is being called on SolrCore too many times...

              [junit] Testsuite: org.apache.solr.core.TestCoreContainer
              [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 3.611 sec
              [junit] 
              [junit] ------------- Standard Error -----------------
              [junit] 2011.08.16. 9:59:36 org.apache.solr.core.SolrCore close
              [junit] SEVERE: Too many close [count:-1] on org.apache.solr.core.SolrCore@25a41cc7. Please report this exception to solr-user@lucene.apache.org
              [junit] ------------- ---------------- ---------------
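For readers unfamiliar with this kind of message, the negative count in the log is what a reference-counted close reports when close() is called more often than the object was acquired. The following is a generic sketch of that pattern, not SolrCore's actual implementation: the class `RefCountedResource` and its methods are hypothetical, and throwing an exception on over-close is a choice made for this example (the log above shows the condition being logged rather than thrown).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of reference-counted closing; not SolrCore itself.
public class RefCountedResource {

    // The creator holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean released = false;

    // Each additional holder takes a reference before using the resource.
    public void open() {
        refCount.incrementAndGet();
    }

    // Drops one reference. Returns true only for the call that releases the resource.
    public boolean close() {
        int count = refCount.decrementAndGet();
        if (count > 0) {
            return false; // other holders remain
        }
        if (count < 0) {
            // More close() calls than references: the situation reported in the log.
            throw new IllegalStateException("Too many close [count:" + count + "]");
        }
        released = true; // count reached zero: release underlying resources here
        return true;
    }

    public boolean isReleased() {
        return released;
    }
}
```

Under this scheme, a caller that closes an object still owned by a container (which will close it again later) drives the count below zero.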
          
          Yury Kats added a comment -

          Thanks, guys. Appreciate the quick turnaround!

          Mark Miller added a comment -

          Thanks Yury and hossman!

          Hoss Man added a comment -

          Bah ... forgot to svn add

          Mark Miller added a comment -

          Thanks a lot for the catch and diagnosis Yury - hossman tends to sleep all day, but when he gets up and delivers his patch, we will get this in right away.

          Mark Miller added a comment -

          Hey hossman, your patch seems to be missing some content

          Hoss Man added a comment -

          patch of persistence tests at the CoreContainer level (since that's where the bug was) that incorporates Yury's fix.

          the assertions could definitely be beefed up to sanity check more aspects of the serialization, and we should really also be testing that "load" works and parses all of these things back in in the expected way as well, but it's a start.

          The thing that's currently hanging me up is that somehow the test is leaking a SolrIndexSearcher reference. I thought maybe it was because of the SolrCores i was creating+registering and then ignoring, but if i try to close them i get an error about too many decrefs instead.

          I'll let miller figure it out

          Yury Kats added a comment -

          Patch. Create map of attributes inside the loop.


            People

            • Assignee:
              Mark Miller
              Reporter:
              Yury Kats