SOLR-6909

Allow pluggable atomic update merging logic

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Clients should be able to introduce their own merging logic by implementing a new class that the DistributedUpdateProcessor will use. This is particularly useful if you need a custom hook to compare the incoming document with the document already resident in the index: there is currently no way to perform that operation, nor can you extend the DistributedUpdateProcessor to provide the modifications.

      1. SOLR-6909.patch
        19 kB
        Steve Davids
      2. SOLR-6909.patch
        19 kB
        Steve Davids

        Activity

        Steve Davids added a comment -

        Attached a patch which pulls the current merging implementation out of the DistributedUpdateProcessor into a new AtomicUpdateDocumentMerger class. The DistributedUpdateProcessorFactory instantiates a new AtomicUpdateDocumentMerger and passes it to the DistributedUpdateProcessor. This approach allows clients to extend the DistributedUpdateProcessorFactory and instantiate their own custom AtomicUpdateDocumentMerger, which is then passed along to the DistributedUpdateProcessor. One thing that I'm not thrilled about is having a static 'isAtomicUpdate' method (currently in the code). I tried to remove the static, but a couple of other classes require that static method to be there, and having a merger member variable didn't quite make sense in those cases, so I left it static.
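
The wiring described in this comment could look roughly like the sketch below. It uses simplified stand-in classes built on plain maps, not the real Solr types: `DocumentMerger`, `UpdateProcessor`, `UpdateProcessorFactory`, `KeepCreatedMerger`, `CustomFactory`, and the `created` field are all hypothetical names chosen for illustration, and the actual signatures in the patch may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for illustration only; these are NOT the Solr classes.
class DocumentMerger {
    /** Default behaviour: every incoming field overwrites the stored value. */
    Map<String, Object> merge(Map<String, Object> incoming, Map<String, Object> stored) {
        Map<String, Object> merged = new HashMap<>(stored);
        merged.putAll(incoming);
        return merged;
    }
}

class UpdateProcessor {
    private final DocumentMerger merger;

    // The merger is handed in by the factory instead of being hard-coded here.
    UpdateProcessor(DocumentMerger merger) {
        this.merger = merger;
    }

    Map<String, Object> process(Map<String, Object> incoming, Map<String, Object> stored) {
        return merger.merge(incoming, stored);
    }
}

class UpdateProcessorFactory {
    /** The default factory wires in the default merger. */
    UpdateProcessor getInstance() {
        return new UpdateProcessor(new DocumentMerger());
    }
}

// Hypothetical client extension: never overwrite an existing "created" value.
class KeepCreatedMerger extends DocumentMerger {
    @Override
    Map<String, Object> merge(Map<String, Object> incoming, Map<String, Object> stored) {
        Map<String, Object> merged = super.merge(incoming, stored);
        if (stored.containsKey("created")) {
            merged.put("created", stored.get("created"));
        }
        return merged;
    }
}

// The client extends the factory and supplies its own merger.
class CustomFactory extends UpdateProcessorFactory {
    @Override
    UpdateProcessor getInstance() {
        return new UpdateProcessor(new KeepCreatedMerger());
    }
}
```

With this shape, swapping the merging behaviour only requires registering a different factory; the processor itself does not change.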

        Ishan Chattopadhyaya added a comment -

        FYI: one idea I was once looking at was the ability to update a field value using a JavaScript expression; see SOLR-5979.
        Some code is here: https://issues.apache.org/jira/secure/attachment/12639276/SOLR-5944.patch (the "expr" operation, along with "add" and "inc").

        Steve Davids added a comment - - edited

        The JavaScript approach is interesting, but it seems overly complex when you always want the merging logic to work one specific way. Additionally, I have a use case where I download a document in an update processor, extract fields from the downloaded content, and index that document. If I can't download the document, I set the doc's status to error, but that is only valid if a good document doesn't already exist in the index. So if an error doc is about to be merged on top of an existing good document, an exception is thrown and the good document isn't clobbered. The approach taken in this ticket gives you that added flexibility through a customizable AtomicUpdateDocumentMerger.

        Another added benefit is that it cleans up the DistributedUpdateProcessor a little. One modification I might want to make to the attached patch is to add a `doSet` and a `doAdd` method, which would allow overrides of each specific merge type.
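
The error-status use case from this comment can be sketched as a merger subclass that refuses the merge. This is only an illustration on plain maps, not the Solr API: `BaseMerger`, `ErrorGuardMerger`, the `status` field name, and the `"error"` value are assumptions, and a `null` stored document is assumed to mean that nothing is in the index yet.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a default merger; NOT the Solr class.
class BaseMerger {
    /** Incoming fields overwrite stored ones; a null stored doc means "not indexed yet". */
    Map<String, Object> merge(Map<String, Object> incoming, Map<String, Object> stored) {
        Map<String, Object> merged = (stored == null) ? new HashMap<>() : new HashMap<>(stored);
        merged.putAll(incoming);
        return merged;
    }
}

// Hypothetical custom merger: an error document may not replace a good one.
class ErrorGuardMerger extends BaseMerger {
    @Override
    Map<String, Object> merge(Map<String, Object> incoming, Map<String, Object> stored) {
        boolean incomingIsError = "error".equals(incoming.get("status"));
        boolean storedIsGood = stored != null && !"error".equals(stored.get("status"));
        if (incomingIsError && storedIsGood) {
            // Rejecting the merge leaves the good document in the index untouched.
            throw new IllegalStateException(
                "Refusing to overwrite a good document with an error document");
        }
        return super.merge(incoming, stored);
    }
}
```

An error document is still accepted when no document exists yet or when the stored document is itself in the error state, which matches the condition described above.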

        Steve Davids added a comment -

        Updated the patch to add 'doSet' and 'doAdd' methods, which allow clients to override the implementation of a specific atomic update command.
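
As a rough illustration of per-operation hooks like these, the sketch below dispatches each update operation to an overridable method. It is a simplified model on plain maps and does not reproduce the signatures in the patch: `OperationMerger`, `DedupingMerger`, and the `{operation: value}` update layout are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Simplified stand-in; NOT the Solr class. An update maps field -> {operation: value}.
class OperationMerger {
    /** Applies every operation in the update to a copy of the stored document. */
    Map<String, Object> merge(Map<String, Map<String, Object>> update, Map<String, Object> stored) {
        Map<String, Object> result = new HashMap<>(stored);
        for (Map.Entry<String, Map<String, Object>> field : update.entrySet()) {
            for (Map.Entry<String, Object> op : field.getValue().entrySet()) {
                switch (op.getKey()) {
                    case "set":
                        doSet(result, field.getKey(), op.getValue());
                        break;
                    case "add":
                        doAdd(result, field.getKey(), op.getValue());
                        break;
                    default:
                        throw new IllegalArgumentException("Unknown operation: " + op.getKey());
                }
            }
        }
        return result;
    }

    /** Hook for "set": replace the field value. */
    protected void doSet(Map<String, Object> doc, String field, Object value) {
        doc.put(field, value);
    }

    /** Hook for "add": append to the field, building a new list so the stored one is untouched. */
    protected void doAdd(Map<String, Object> doc, String field, Object value) {
        List<Object> values = new ArrayList<>();
        Object existing = doc.get(field);
        if (existing instanceof List) {
            values.addAll((List<?>) existing);
        } else if (existing != null) {
            values.add(existing);
        }
        values.add(value);
        doc.put(field, values);
    }
}

// Hypothetical override of a single merge type: skip values that are already present.
class DedupingMerger extends OperationMerger {
    @Override
    protected void doAdd(Map<String, Object> doc, String field, Object value) {
        Object existing = doc.get(field);
        boolean present = (existing instanceof List)
            ? ((List<?>) existing).contains(value)
            : Objects.equals(existing, value);
        if (!present) {
            super.doAdd(doc, field, value);
        }
    }
}
```

Only `doAdd` is overridden here, so "set" keeps its default behaviour.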

        ASF subversion and git services added a comment -

        Commit 1652660 from Yonik Seeley in branch 'dev/trunk'
        [ https://svn.apache.org/r1652660 ]

        SOLR-6909: Extract atomic update handling logic into AtomicUpdateDocumentMerger

        ASF subversion and git services added a comment -

        Commit 1652670 from Yonik Seeley in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1652670 ]

        SOLR-6909: Extract atomic update handling logic into AtomicUpdateDocumentMerger

        Yonik Seeley added a comment -

        Thanks Steve, I've been meaning to extract that logic for some time now...
        I've also slapped an experimental tag on the class to allow easy modification in the future.

        Timothy Potter added a comment -

        Bulk close after 5.1 release


          People

          • Assignee:
            Unassigned
            Reporter:
            Steve Davids
          • Votes:
            1
            Watchers:
            5
