SOLR-6909

Allow pluggable atomic update merging logic

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Clients should be able to introduce their own merging logic by implementing a new class that the DistributedUpdateProcessor uses. This is particularly useful if you need a custom hook to compare the incoming document against the document already resident in the index: there is currently no way to perform that operation, nor can you extend the DistributedUpdateProcessor to add it.

      1. SOLR-6909.patch
        19 kB
        Steve Davids
      2. SOLR-6909.patch
        19 kB
        Steve Davids

        Activity

        thelabdude Timothy Potter added a comment -

        Bulk close after 5.1 release

        yseeley@gmail.com Yonik Seeley added a comment -

        Thanks Steve, I've been meaning to extract that logic for some time now...
        I've also slapped an experimental tag on the class to allow easy modification in the future.

        jira-bot ASF subversion and git services added a comment -

        Commit 1652670 from Yonik Seeley in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1652670 ]

        SOLR-6909: Extract atomic update handling logic into AtomicUpdateDocumentMerger

        jira-bot ASF subversion and git services added a comment -

        Commit 1652660 from Yonik Seeley in branch 'dev/trunk'
        [ https://svn.apache.org/r1652660 ]

        SOLR-6909: Extract atomic update handling logic into AtomicUpdateDocumentMerger

        sdavids Steve Davids added a comment -

        Updated patch to add 'doSet' and 'doAdd' methods, which allow clients to override the implementation of a specific atomic update command.
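
For readers following along, the per-operation override idea can be sketched as below. This is a minimal illustration, not the patch itself: `OpMerger`, `DedupingAddMerger`, and the map-based document shape are hypothetical stand-ins, and the real `doSet`/`doAdd` signatures in AtomicUpdateDocumentMerger work on Solr document types and may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the merger: documents are plain maps of
// field name -> list of values, and an update is field name -> {operation: value}.
class OpMerger {
    // Routes every operation in the update to its own overridable hook.
    Map<String, List<Object>> merge(Map<String, Map<String, Object>> update,
                                    Map<String, List<Object>> stored) {
        Map<String, List<Object>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Object>> e : stored.entrySet()) {
            result.put(e.getKey(), new ArrayList<>(e.getValue())); // copy, 'stored' stays untouched
        }
        for (Map.Entry<String, Map<String, Object>> field : update.entrySet()) {
            for (Map.Entry<String, Object> op : field.getValue().entrySet()) {
                switch (op.getKey()) {
                    case "set": doSet(result, field.getKey(), op.getValue()); break;
                    case "add": doAdd(result, field.getKey(), op.getValue()); break;
                    default: throw new IllegalArgumentException("unknown operation: " + op.getKey());
                }
            }
        }
        return result;
    }

    // Default 'set': replace whatever the field held with the single new value.
    protected void doSet(Map<String, List<Object>> doc, String field, Object value) {
        List<Object> values = new ArrayList<>();
        values.add(value);
        doc.put(field, values);
    }

    // Default 'add': append the value, creating the field if needed.
    protected void doAdd(Map<String, List<Object>> doc, String field, Object value) {
        doc.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
    }
}

// A client-side variation: only 'add' changes, 'set' keeps the default behavior.
class DedupingAddMerger extends OpMerger {
    @Override
    protected void doAdd(Map<String, List<Object>> doc, String field, Object value) {
        List<Object> existing = doc.get(field);
        if (existing != null && existing.contains(value)) {
            return; // skip values the field already holds
        }
        super.doAdd(doc, field, value);
    }
}
```

Splitting the merge into one hook per command means a client that only cares about, say, 'add' semantics does not have to reimplement the whole merge loop.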

        sdavids Steve Davids added a comment - - edited

        The JavaScript approach is interesting but seems overly complex when you always want the merging logic to work one specific way. Additionally, I have a use case where I download a document in an update processor, extract fields from the downloaded content, and index that document. The interesting thing here is that if I can't download the document I set the doc's status to error, though this is only valid if a good document doesn't already exist in the index. So if an error doc is about to be merged on top of an existing good document, an exception is thrown and the good document isn't clobbered. The approach taken in this ticket gives you that added flexibility through a customizable AtomicUpdateDocumentMerger.

        Another added benefit is that it cleans up the DistributedUpdateProcessor a little. One modification I might want to make to the attached patch is to add `doSet` and `doAdd` methods, which would allow overrides of each specific merge type.
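
The download use case above could look roughly like the following sketch. Everything here is an assumption made for illustration: `BaseMerger` and `ErrorGuardMerger` are hypothetical stand-ins that use plain maps instead of Solr documents, and the `status` field name and `"error"` value are guesses at how the status might be stored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the default merger: incoming fields overwrite stored ones.
class BaseMerger {
    Map<String, Object> merge(Map<String, Object> incoming, Map<String, Object> stored) {
        Map<String, Object> result = new LinkedHashMap<>(stored);
        result.putAll(incoming);
        return result;
    }
}

// Custom merger: an error document may replace another error document,
// but never a good document that is already in the index.
class ErrorGuardMerger extends BaseMerger {
    @Override
    Map<String, Object> merge(Map<String, Object> incoming, Map<String, Object> stored) {
        // "status" / "error" are assumed names, not taken from the patch.
        boolean incomingFailed = "error".equals(incoming.get("status"));
        boolean storedIsGood = !stored.isEmpty() && !"error".equals(stored.get("status"));
        if (incomingFailed && storedIsGood) {
            throw new IllegalStateException(
                "refusing to merge an error document over good document " + stored.get("id"));
        }
        return super.merge(incoming, stored);
    }
}
```

Because the check needs both the incoming and the stored document, it has to live in the merge step; an ordinary update processor earlier in the chain only sees the incoming document.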

        ichattopadhyaya Ishan Chattopadhyaya added a comment -

        FYI. One idea I was once looking at was to provide the ability of updating a field value using a javascript expression. SOLR-5979.
        Some code is in here, https://issues.apache.org/jira/secure/attachment/12639276/SOLR-5944.patch (the "expr" operation, along with "add" and "inc").

        sdavids Steve Davids added a comment -

        Attached a patch which pulls the current merging implementation out of the DistributedUpdateProcessor into a new AtomicUpdateDocumentMerger class. The DistributedUpdateProcessorFactory instantiates a new AtomicUpdateDocumentMerger and passes it to the DistributedUpdateProcessor. This approach allows clients to extend the DistributedUpdateProcessorFactory and instantiate their own custom AtomicUpdateDocumentMerger, which is then passed along to the DistributedUpdateProcessor. One thing I'm not thrilled about is the static 'isAtomicUpdate' method (currently in the code). I tried to remove the static, but a couple of other classes require the static method, and giving them a merger member variable didn't quite make sense, so I left it static.
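
The wiring described in this comment can be sketched as follows. It is a simplified model under stated assumptions, not the patch: the `Sketch*` classes are hypothetical stand-ins for AtomicUpdateDocumentMerger, DistributedUpdateProcessor, and DistributedUpdateProcessorFactory, documents are plain maps, only a "set" operation is modeled, and the `createMerger` hook is invented here to show where a subclass would swap in its own merger.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for the extracted merging class.
class SketchDocumentMerger {
    // Static, like the 'isAtomicUpdate' check mentioned above, so callers
    // that hold no merger instance can still ask the question.
    static boolean isAtomicUpdate(Map<String, Object> doc) {
        for (Object value : doc.values()) {
            if (value instanceof Map) {
                return true; // e.g. {"price": {"set": 10}}
            }
        }
        return false;
    }

    // Default merge: apply each {"set": value} operation onto a copy of the stored doc.
    Map<String, Object> merge(Map<String, Object> incoming, Map<String, Object> stored) {
        Map<String, Object> result = new LinkedHashMap<>(stored);
        for (Map.Entry<String, Object> field : incoming.entrySet()) {
            if (field.getValue() instanceof Map) {
                Map<?, ?> ops = (Map<?, ?>) field.getValue();
                if (ops.containsKey("set")) {
                    result.put(field.getKey(), ops.get("set"));
                }
            } else {
                result.put(field.getKey(), field.getValue());
            }
        }
        return result;
    }
}

// Stand-in processor: it receives its merger instead of owning the merge logic.
class SketchUpdateProcessor {
    private final SketchDocumentMerger merger;

    SketchUpdateProcessor(SketchDocumentMerger merger) {
        this.merger = merger;
    }

    Map<String, Object> process(Map<String, Object> incoming, Map<String, Object> stored) {
        if (!SketchDocumentMerger.isAtomicUpdate(incoming)) {
            return incoming; // a regular update replaces the whole document
        }
        return merger.merge(incoming, stored);
    }
}

// Stand-in factory: the single place where the merger implementation is chosen.
class SketchProcessorFactory {
    // Hypothetical hook; subclasses override it to supply a custom merger.
    protected SketchDocumentMerger createMerger() {
        return new SketchDocumentMerger();
    }

    SketchUpdateProcessor getInstance() {
        return new SketchUpdateProcessor(createMerger());
    }
}
```

A client would subclass the factory, return its own merger from the hook, and leave the processor untouched, which is the extension point the comment describes.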


          People

          • Assignee:
            Unassigned
            Reporter:
            sdavids Steve Davids
          • Votes:
            1
            Watchers:
            5
