Solr / SOLR-16160

UpdateXmlMessages duplicate data when data is removed and then added in the same message


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 9.0, 8.11.1, 9.1
    • Fix Version/s: main (10.0), 9.2
    • Component/s: search, update
    • Labels: None

    Description

      Replication Steps

      1. Have two multi-value fields with the following schema 

      <field name="docTags" type="plongs" multiValued="true" indexed="true" stored="true"/>
      <field name="tg0001" type="ipro_strings" multiValued="true" indexed="true" stored="true"/>
      
      <fieldType name="plong" class="solr.LongPointField" docValues="true"/>
      <fieldType name="ipro_strings" class="solr.TextField" sortMissingLast="true" multiValued="true">
      <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      </fieldType> 
      
      

      2. Execute the following UpdateXmlMessage

      <add commitWithin="1000">
      <doc>
      <field name="_id">1</field>
      <field name="docTags" update="remove"><![CDATA[1]]></field>
      <field name="tg0001" update="remove"><![CDATA[Convert to Image]]></field>
      <field name="docTags" update="remove"><![CDATA[4]]></field>
      <field name="tg0001" update="remove"><![CDATA[Large Files]]></field>
      <field name="docTags" update="remove"><![CDATA[6]]></field>
      <field name="tg0001" update="remove"><![CDATA[To Bulk-Print]]></field>
      </doc>
      </add>
      <add commitWithin="1000">
      <doc>
      <field name="_id">1</field>
      <field name="docTags" update="remove"><![CDATA[6]]></field>
      <field name="tg0001" update="remove"><![CDATA[To Bulk-Print]]></field>
      <field name="docTags" update="add-distinct"><![CDATA[1]]></field>
      <field name="tg0001" update="add-distinct"><![CDATA[Convert to Image]]></field>
      <field name="docTags" update="add-distinct"><![CDATA[4]]></field>
      <field name="tg0001" update="add-distinct"><![CDATA[Large Files]]></field>
      </doc>
      </add>
      <add commitWithin="1000">
      <doc>
      <field name="_id">1</field>
      <field name="docTags" update="remove"><![CDATA[1]]></field>
      <field name="tg0001" update="remove"><![CDATA[Convert to Image]]></field>
      <field name="docTags" update="remove"><![CDATA[4]]></field>
      <field name="tg0001" update="remove"><![CDATA[Large Files]]></field>
      <field name="docTags" update="add-distinct"><![CDATA[6]]></field>
      <field name="tg0001" update="add-distinct"><![CDATA[To Bulk-Print]]></field>
      </doc>
      </add> 

      3. Observe that the document now contains duplicate values in those fields (see the attached screenshots)

      Note: If the update="add-distinct" fields come first in the XML message and the update="remove" fields come last, the update works as expected and adds only one instance of each value. The issue occurs only when the remove fields come before the add-distinct fields.
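The ordering described in the note can be sketched by rearranging the second message from step 2. This is only an illustration of the reporter's observation, not a confirmed fix: the field names and values are copied from that message, and the only change is that the add-distinct fields are moved ahead of the remove fields.

```xml
<add commitWithin="1000">
  <doc>
    <field name="_id">1</field>
    <!-- add-distinct fields first: the ordering the reporter says behaves correctly -->
    <field name="docTags" update="add-distinct"><![CDATA[1]]></field>
    <field name="tg0001" update="add-distinct"><![CDATA[Convert to Image]]></field>
    <field name="docTags" update="add-distinct"><![CDATA[4]]></field>
    <field name="tg0001" update="add-distinct"><![CDATA[Large Files]]></field>
    <!-- remove fields last -->
    <field name="docTags" update="remove"><![CDATA[6]]></field>
    <field name="tg0001" update="remove"><![CDATA[To Bulk-Print]]></field>
  </doc>
</add>
```

The third message from step 2 would be reordered in the same way. Whether this ordering avoids the duplicates in every affected version is an assumption based solely on the note above.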

       

      Is this because the updates must appear in some undocumented order, or is it a true defect?

      Attachments

        1. image-2022-04-20-10-34-08-573.png
          81 kB
          Nick Hadder
        2. image-2022-04-20-10-35-05-247.png
          63 kB
          Nick Hadder


            People

              Assignee: Kevin Risden (krisden)
              Reporter: Nick Hadder (nhadder)
              Votes: 0
              Watchers: 5


                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 10m