Solr / SOLR-4468

Add document but keep existing fields values

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 4.1
    • Fix Version/s: 4.2
    • Component/s: update

      Description

      The original need is a field that represents the (first) insertion time of the document.

      It can be implemented as an additional value for the optional "update" attribute of the 'field' element, handled in AddUpdateCommand.
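
      For context, here is a minimal sketch of the atomic update syntax that the proposal would extend, using the "set", "add" and "inc" values of the "update" attribute as they are understood to work in Solr 4.x. The document id and the field names (doc1, price, tags, popularity) are hypothetical and serve only as illustration.

      ```xml
      <add>
        <doc>
          <!-- hypothetical unique key identifying the document to modify -->
          <field name="id">doc1</field>
          <!-- "set" replaces the current value of the field -->
          <field name="price" update="set">99</field>
          <!-- "add" appends a value to a (multiValued) field -->
          <field name="tags" update="add">new-tag</field>
          <!-- "inc" increments a numeric field by the given amount -->
          <field name="popularity" update="inc">1</field>
        </doc>
      </add>
      ```

      The proposed "create" value would sit alongside these existing values rather than replace any of them.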

      1. SOLR-4468.patch
        0.9 kB
        Isaac Hebsh

        Activity

        Isaac Hebsh added a comment -

        Fix suggestion

        Isaac Hebsh added a comment - edited

        example syntax:

        <add>
          <doc>
            <field name="first_update_timestamp" update="create">NOW</field>
          </doc>
        </add>
        
        Isaac Hebsh added a comment -

        Notes:

        • It depends on the Atomic Update feature (which depends on RealTimeGet).
        • It will work on stored fields only.
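
        To illustrate the stored-field note, a schema.xml declaration along the following lines should satisfy it. This is only a sketch: the type name "date" is an assumption borrowed from the stock example schema, and the field name is taken from the example syntax above.

        ```xml
        <!-- schema.xml: stored="true" so the existing value can be read back during an atomic update -->
        <!-- type="date" is an assumption; use whichever date field type the schema defines -->
        <field name="first_update_timestamp" type="date" indexed="true" stored="true" multiValued="false"/>
        ```

        Because the feature builds on Atomic Update and RealTimeGet, the update log would presumably also need to be enabled in solrconfig.xml.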
        Hoss Man added a comment -

        Isaac: thank you for your patch, a few comments...

        1) the patch would definitely need to be updated to include some tests before this feature could be considered for inclusion in Solr

        2) skimming the patch, I'm not convinced it behaves in the way you describe; in particular, consider what would happen if (in your example usage) a document existed in the index which did not have any value in the "first_update_timestamp" field. From what I can tell, the patch as written isn't actually doing anything to distinguish between the case of "this is the first time adding this document (so accept the field value)" and "this is the first time someone has tried to set this value" ... although perhaps I'm just misunderstanding what your goal is?

        3) I'm not sure that the verb "create" fits with the other existing verbs that are available for atomic updates ... it seems like what we really want is something more along the lines of "setIfEmpty" or "setOnCreate" (depending on the intended behavior as mentioned above in #2)

        4) it's not clear to me from skimming the patch if this will work with multiple values; I would definitely need to see a test case verifying that both values were added in a situation like...

        <add>
          <doc>
            <field name="foo" update="create">ABC</field>
            <field name="foo" update="create">XYZ</field>
            ...
          </doc>
        </add>
        

        I would also like to point out that (unless I'm misunderstanding) for the initial use case that seems to have motivated this issue (i.e. "timestamp when doc was first indexed", where the field must be single valued to make sense) I'm pretty sure this goal is already achievable w/o any code changes if:

        • the clients always specify update="add" on this particular field
        • FirstFieldValueUpdateProcessorFactory is configured for this field after the DistributedUpdateProcessorFactory
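
        The two conditions above could be sketched roughly as follows in solrconfig.xml. This is an untested illustration, not a configuration taken from this issue: the chain name "keep-first-timestamp" is hypothetical, the field name comes from the earlier example, and the surrounding Log and Run processors are simply the usual members of an update chain.

        ```xml
        <!-- solrconfig.xml: hypothetical chain name, for illustration only -->
        <updateRequestProcessorChain name="keep-first-timestamp">
          <processor class="solr.LogUpdateProcessorFactory"/>
          <!-- atomic updates are merged with the stored document at this stage -->
          <processor class="solr.DistributedUpdateProcessorFactory"/>
          <!-- placed after the distributed processor: keeps only the first value of the field -->
          <processor class="solr.FirstFieldValueUpdateProcessorFactory">
            <str name="fieldName">first_update_timestamp</str>
          </processor>
          <processor class="solr.RunUpdateProcessorFactory"/>
        </updateRequestProcessorChain>
        ```

        A client would then send the timestamp with update="add" on every update (the id value "doc1" is hypothetical):

        ```xml
        <add>
          <doc>
            <field name="id">doc1</field>
            <field name="first_update_timestamp" update="add">NOW</field>
          </doc>
        </add>
        ```

        Under these assumptions, the first update stores the timestamp, and on later updates the newly added value lands after the existing one and is dropped, so the original timestamp survives. The chain still has to be selected for the request, for example through the update.chain parameter.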
        Isaac Hebsh added a comment -

        Wonderful! FirstFieldValueUpdateProcessorFactory is exactly the solution!

        And, thank you for the review

        Isaac Hebsh added a comment -

        FirstFieldValueUpdateProcessorFactory is an existing solution


          People

          • Assignee: Unassigned
          • Reporter: Isaac Hebsh