
SOLR-14971: AtomicUpdate 'remove' fails on 'pints' fields

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 8.5.2
    • Fix Version/s: 8.8, 9.0
    • Component/s: update
    • Labels: None

    Description

      The "remove" atomic update action on multivalued int fields fails if the document being changed is uncommitted.

      At first glance this appears to be a type-related issue. AtomicUpdateDocumentMerger attempts to handle multivalued int fields by processing the List<Integer> type, but in uncommitted docs int fields are still List<String> in the tlog. Conceptually this feels similar to SOLR-13331.

      It's likely this issue also affects other numeric and date fields.

      Attached is a simple script to reproduce, meant to be run from the root of a Solr install.
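
      For illustration only (the attached reproduce.sh is the actual reproduction), here is a minimal SolrJ sketch of the failure mode. The Solr URL, the collection name "techproducts", and the field name "myints_is" are assumptions, with "myints_is" assumed to be a multivalued pints field:

        import java.util.List;
        import java.util.Map;

        import org.apache.solr.client.solrj.SolrClient;
        import org.apache.solr.client.solrj.impl.HttpSolrClient;
        import org.apache.solr.common.SolrInputDocument;

        public class AtomicRemoveRepro {
          public static void main(String[] args) throws Exception {
            try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
              // Index a doc with several int values, but do NOT commit, so the
              // follow-up update is merged against the tlog copy of the document.
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("id", "repro-1");
              doc.addField("myints_is", List.of(1, 2, 3));
              client.add("techproducts", doc);

              // Atomic 'remove' of one of the uncommitted values; per this issue,
              // the value silently survives because the merge uses the tlog doc.
              SolrInputDocument update = new SolrInputDocument();
              update.addField("id", "repro-1");
              update.addField("myints_is", Map.of("remove", 2));
              client.add("techproducts", update);

              client.commit("techproducts");
            }
          }
        }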

      Attachments

        1. reproduce.sh
          2 kB
          Jason Gerlowski


          Activity

             gerlowskija Jason Gerlowski added a comment -

             It turns out this is a type problem, but not the one I guessed above. The "existing document" values aren't Strings - they're Longs!

            The code in question is the removeObj method here:

              private void removeObj(@SuppressWarnings({"rawtypes"}) Collection original, Object toRemove, String fieldName) {
                if (isChildDoc(toRemove)) {
                  removeChildDoc(original, (SolrInputDocument) toRemove);
                } else {
                  // 'toRemove' is converted to the schema's native type (an Integer for int
                  // fields), but 'original' may hold Longs when it came from the tlog.
                  original.remove(getNativeFieldValue(fieldName, toRemove));
                }
              }
            

             When removing an int value, getNativeFieldValue consistently returns an Integer. But the 'original' Collection of existing values has a different element type depending on whether it was retrieved from the update log (Longs) or from the index (Integers).

            Java's Number classes declare equals() in such a way that a Long is never equal to an Integer, even when the two represent the same numeric quantity. So a type mismatch causes the remove attempt to fail when the existing doc is retrieved from the tlog.
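
             To see that equality gap concretely, here is a small plain-Java sketch (nothing Solr-specific; the values are chosen for illustration):

              import java.util.ArrayList;
              import java.util.List;

              public class NumberEqualsDemo {
                public static void main(String[] args) {
                  // Boxed numeric types only compare equal to instances of their own class.
                  System.out.println(Long.valueOf(1).equals(Integer.valueOf(1)));   // false
                  System.out.println(Integer.valueOf(1).equals(Long.valueOf(1)));   // false

                  // So Collection.remove(Integer) never matches a Long element,
                  // which mirrors the tlog (Long) vs index (Integer) mismatch.
                  List<Object> fromTlog = new ArrayList<>(List.of(1L, 2L, 3L));
                  System.out.println(fromTlog.remove(Integer.valueOf(2)));          // false - nothing removed
                  System.out.println(fromTlog);                                     // [1, 2, 3]
                }
              }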

             This cause has one upside - while it's likely to affect int/long and double/float, it seems specific to numerics and doesn't translate to other types.

             There are a couple of ways we could address this:

            1. Make sure the tlog's SolrInputDocument has int values instead of longs where appropriate.
            2. Special-case numeric values in AtomicUpdateDocumentMerger.removeObj and add custom removal logic that handles the inconsistency in our input types.

             Conceptually I like (1) much better - it'd be nice if the atomic-update code (and anything else that pulls RTG values) didn't have to handle a bunch of arbitrary variations in its input. But this could be a big change and might run afoul of whatever legitimate reasons there might be for storing the values as Longs in the UpdateLog.

            Anyway, I'll probably go with the admittedly-hackier custom logic approach, despite not liking it. If anyone sees a better way forward though, please let me know.
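
             For the record, here is a rough sketch of what the option-(2) special-casing could look like. The class and method names are hypothetical, and this is not the code that was eventually committed:

              import java.util.Collection;
              import java.util.Iterator;

              final class NumericAwareRemoval {
                // Compare numerics by value rather than relying on equals(), so a
                // Long-vs-Integer (tlog vs index) mismatch still matches. Simplified:
                // doubleValue() comparison is good enough for an illustration.
                static boolean removeValue(Collection<Object> original, Object toRemove) {
                  if (toRemove instanceof Number) {
                    Number needle = (Number) toRemove;
                    for (Iterator<Object> it = original.iterator(); it.hasNext(); ) {
                      Object existing = it.next();
                      if (existing instanceof Number
                          && ((Number) existing).doubleValue() == needle.doubleValue()) {
                        it.remove();
                        return true;
                      }
                    }
                    return false;
                  }
                  // Non-numeric values keep the existing equals()-based behavior.
                  return original.remove(toRemove);
                }
              }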

             jira-bot ASF subversion and git services added a comment -

             Commit a7197ac0ce8333ce7019f49c6fab690a96ff7d77 in lucene-solr's branch refs/heads/master from Jason Gerlowski
            [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a7197ac ]

            SOLR-14971: Handle atomic-removes on uncommitted docs (#2056)

            Docs fetched from the update log via RTG look different than docs
            fetched from commits in the index: the types of
            field-values may be different between the two, etc.

            This is a problem for atomic add/remove of field values, where matching
            existing values has historically been done by object equals() calls (via
            Collection operations). This relies on equality checks which don't have
            flexible enough semantics to match values across these different types.
            (For example, `new Long(1).equals(new Integer(1))` returns `false`).
            This was causing some add-distinct and remove operations on
            uncommitted values to silently fail to remove field values.

            This commit patches over this by converting between types in the more
            common cases before using the fallback behavior.

             jira-bot ASF subversion and git services added a comment -

             Commit 8e9db02530d42a45c9ef89d93ea37f340c9a426c in lucene-solr's branch refs/heads/branch_8x from Jason Gerlowski
            [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8e9db02 ]

            SOLR-14971: Handle atomic-removes on uncommitted docs (#2056)

            Docs fetched from the update log via RTG look different than docs
            fetched from commits in the index: the types of
            field-values may be different between the two, etc.

            This is a problem for atomic add/remove of field values, where matching
            existing values has historically been done by object equals() calls (via
            Collection operations). This relies on equality checks which don't have
            flexible enough semantics to match values across these different types.
            (For example, `new Long(1).equals(new Integer(1))` returns `false`).
            This was causing some add-distinct and remove operations on
            uncommitted values to silently fail to remove field values.

            This commit patches over this by converting between types in the more
            common cases before using the fallback behavior.

             gerlowskija Jason Gerlowski added a comment -

             On the GitHub PR, Munendra pointed out that this problem also affects the 'add-distinct' operation, so the commits above fix that too.

             Munendra proposed an alternative to the approach I ended up going with that might be worth looking into in the future. He suggested iterating over the 'original' collection of field values and converting each one using 'toNativeType'. This would ensure that the needle and the haystack are of the same Java type, so to speak, and would save us a lot of manual type checking. I liked this better conceptually, but I wasn't confident enough about what effect changing the type of ALL the uncommitted field values might have downstream to pursue it in this PR. If other cases are found, though, the ugliness of the type checking might make this worth a second look.
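
             For reference, here is a rough sketch of that alternative. The names are hypothetical, and 'toNativeType' is modeled as a plain function parameter rather than the real FieldType conversion, so the sketch stays self-contained:

              import java.util.ArrayList;
              import java.util.Collection;
              import java.util.List;
              import java.util.function.Function;

              final class NormalizeThenRemove {
                // Convert every existing value to the field's native type up front;
                // after that, plain equals()-based Collection operations work again.
                static Collection<Object> normalize(Collection<Object> original,
                                                    Function<Object, Object> toNativeType) {
                  List<Object> normalized = new ArrayList<>(original.size());
                  for (Object value : original) {
                    normalized.add(toNativeType.apply(value));
                  }
                  return normalized;
                }

                public static void main(String[] args) {
                  // tlog values arrive as Longs; normalize them to Integers first.
                  Collection<Object> fromTlog = new ArrayList<>(List.of(1L, 2L, 3L));
                  Collection<Object> normalized = normalize(fromTlog, v -> ((Number) v).intValue());
                  System.out.println(normalized.remove(Integer.valueOf(2))); // true
                }
              }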

            Closing this out.

            janhoy Jan Høydahl added a comment -

            Closing after the 9.0.0 release

            People

              Assignee: gerlowskija Jason Gerlowski
              Reporter: gerlowskija Jason Gerlowski
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h