Solr / SOLR-6700

ChildDocTransformer doesn't return correct children after updating and optimising solr index

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Not A Bug
    • None
    • 4.10.5
    • update
    • None

    Description

      I have an index with nested documents.

      schema.xml snippet
      <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
      <field name="entityType" type="int" indexed="true" stored="true" required="true"/>
      <field name="pName" type="string" indexed="true" stored="true"/>
      <field name="cAlbum" type="string" indexed="true" stored="true"/>
      <field name="cSong" type="string" indexed="true" stored="true"/>
      <field name="_root_" type="string" indexed="true" stored="true"/>
      <field name="_version_" type="long" indexed="true" stored="true"/>
      

      Afterwards I add the following documents:

      <add>
        <doc>
          <field name="id">1</field>
          <field name="pName">Test Artist 1</field>
          <field name="entityType">1</field>
          <doc>
              <field name="id">11</field>
              <field name="cAlbum">Test Album 1</field>
              <field name="cSong">Test Song 1</field>
              <field name="entityType">2</field>
          </doc>
        </doc>
        <doc>
          <field name="id">2</field>
          <field name="pName">Test Artist 2</field>
          <field name="entityType">1</field>
          <doc>
              <field name="id">22</field>
              <field name="cAlbum">Test Album 2</field>
              <field name="cSong">Test Song 2</field>
              <field name="entityType">2</field>
          </doc>
        </doc>
      </add>
      

      I then run the following query:

      http://localhost:8983/solr/collection1/select?q=%7B!parent+which%3DentityType%3A1%7D&fl=*%2Cscore%2C%5Bchild+parentFilter%3DentityType%3A1%5D&wt=json&indent=true
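
The query URL is percent-encoded, which makes the block-join parameters hard to read. As a small sketch (using only the Python standard library, with the URL copied from the report), the parameters can be decoded like this:

```python
from urllib.parse import urlsplit, parse_qs

# URL copied verbatim from the report, split across lines for readability
url = ("http://localhost:8983/solr/collection1/select"
       "?q=%7B!parent+which%3DentityType%3A1%7D"
       "&fl=*%2Cscore%2C%5Bchild+parentFilter%3DentityType%3A1%5D"
       "&wt=json&indent=true")


def decode_params(u):
    """Return the query parameters of u with percent-encoding and '+' decoded."""
    # parse_qs returns a list per key; each key appears once here, so take [0]
    return {name: values[0] for name, values in parse_qs(urlsplit(u).query).items()}


params = decode_params(url)
for name, value in params.items():
    print(f"{name} = {value}")
```

The decoded `q` and `fl` values should match the `params` section echoed back in the response header.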

      I get the correct result: each child is returned under its own parent (compare the _root_ fields).

      add docs
      {
        "responseHeader":{
          "status":0,
          "QTime":1,
          "params":{
            "fl":"*,score,[child parentFilter=entityType:1]",
            "indent":"true",
            "q":"{!parent which=entityType:1}",
            "wt":"json"}},
        "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
            {
              "id":"1",
              "pName":"Test Artist 1",
              "entityType":1,
              "_version_":1483832661048819712,
              "_root_":"1",
              "score":1.0,
              "_childDocuments_":[
              {
                "id":"11",
                "cAlbum":"Test Album 1",
                "cSong":"Test Song 1",
                "entityType":2,
                "_root_":"1"}]},
            {
              "id":"2",
              "pName":"Test Artist 2",
              "entityType":1,
              "_version_":1483832661050916864,
              "_root_":"2",
              "score":1.0,
              "_childDocuments_":[
              {
                "id":"22",
                "cAlbum":"Test Album 2",
                "cSong":"Test Song 2",
                "entityType":2,
                "_root_":"2"}]}]
        }}
      

      I then update one document with an atomic update:

      update doc
      <add>
        <doc>
          <field name="id">1</field>
          <field name="pName" update="set">INIT</field>
        </doc>
      </add>
      

      Running the same query again still returns the right result (identical to the previous one, but with the pName field updated).

      The problem appears only after performing an optimize.
      Now, the same query yields the following result:

      {
        "responseHeader":{
          "status":0,
          "QTime":1,
          "params":{
            "fl":"*,score,[child parentFilter=entityType:1]",
            "indent":"true",
            "q":"{!parent which=entityType:1}",
            "wt":"json"}},
        "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
            {
              "id":"2",
              "pName":"Test Artist 2",
              "entityType":1,
              "_version_":1483832661050916864,
              "_root_":"2",
              "score":1.0,
              "_childDocuments_":[
              {
                "id":"11",
                "cAlbum":"Test Album 1",
                "cSong":"Test Song 1",
                "entityType":2,
                "_root_":"1"},
              {
                "id":"22",
                "cAlbum":"Test Album 2",
                "cSong":"Test Song 2",
                "entityType":2,
                "_root_":"2"}]},
            {
              "id":"1",
              "pName":"INIT",
              "entityType":1,
              "_root_":"1",
              "_version_":1483832916867809280,
              "score":1.0}]
        }}
      

      As can be seen, the document with id:2 now contains the child with id:11, which belongs to the document with id:1 (its _root_ is still 1), while the document with id:1 is returned with no children at all.
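
The mismatch can also be detected mechanically. The following is a minimal sketch, assuming each top-level document's `id` equals the `_root_` of its children (which holds for the responses in this report); `misplaced_children` is a hypothetical helper name, and the sample data is a trimmed copy of the `docs` array returned after the optimize.

```python
def misplaced_children(docs):
    """Return (parent_id, child_id, child_root) for every child whose
    _root_ differs from the id of the parent it was returned under."""
    problems = []
    for parent in docs:
        for child in parent.get("_childDocuments_", []):
            if child.get("_root_") != parent["id"]:
                problems.append((parent["id"], child["id"], child.get("_root_")))
    return problems


# Trimmed copy of the "docs" array from the response after the optimize
docs_after_optimize = [
    {"id": "2", "_root_": "2", "_childDocuments_": [
        {"id": "11", "_root_": "1"},
        {"id": "22", "_root_": "2"},
    ]},
    {"id": "1", "_root_": "1"},  # no _childDocuments_ key at all
]

print(misplaced_children(docs_after_optimize))  # [('2', '11', '1')]
```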

      I haven't found any references on the web about this except http://blog.griddynamics.com/2013/09/solr-block-join-support.html
      Similar issue: SOLR-6096

      Is this problem known? Is there a workaround for this?
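
One way to reason about the result, offered as an assumption and not as a confirmed diagnosis: block join relates children to parents by their position in the index (children directly precede their parent), not by the `_root_` value. If the atomic update re-adds document 1 on its own at the end of the index, and the optimize drops the old deleted copy, the child with id:11 ends up directly before the block of document 2. The toy model below is not Solr code; `attach_children` and both document orders are hypothetical and only illustrate that this assumption reproduces the response shown above.

```python
def attach_children(index_order, parent_ids):
    """Toy block join: every parent collects the non-parent documents
    that appear between the previous parent and itself."""
    result = {}
    pending = []
    for doc_id in index_order:
        if doc_id in parent_ids:
            result[doc_id] = pending
            pending = []
        else:
            pending.append(doc_id)
    return result


parents = {"1", "2"}

# Assumed order after the initial add: each child directly before its parent
original = ["11", "1", "22", "2"]

# Assumed order after the atomic update plus optimize: the old copy of "1"
# is gone and the updated "1" sits alone at the end
after_update_and_optimize = ["11", "22", "2", "1"]

print(attach_children(original, parents))
print(attach_children(after_update_and_optimize, parents))
```

Under these assumptions, the second call attaches both 11 and 22 to parent 2 and leaves parent 1 without children, which is the same shape as the response after the optimize.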

            People

              Assignee: Unassigned
              Reporter: Bogdan Marinescu (bogandy)
              Votes: 0
              Watchers: 12
