SOLR-8276

Atomic updates & RTG don't work with non-stored docvalues

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.10.4, 5.4
    • Fix Version/s: 5.5, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Currently, during atomic updates, non-stored docvalues fields are (a) not carried forward to the updated document, and (b) not usable with operations like "inc". Also, RTG (realtime get) of documents containing such fields doesn't return those fields if the document is fetched from the index.
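
To make points (a) and (b) concrete, here is a minimal toy model of what an atomic update is expected to do once non-stored docvalues fields are taken into account. It is only a sketch: the class and method names are hypothetical and are not Solr classes, and treating a missing field as 0 for "inc" is an assumption made for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: AtomicUpdateSketch and apply() are hypothetical names, not Solr APIs.
class AtomicUpdateSketch {

    /**
     * Rebuilds a full document from its previous version plus atomic operations.
     *
     * storedFields  - values that can be read back from stored fields
     * docValuesOnly - values of non-stored docvalues fields (the ones this issue is about)
     * ops           - field name -> single-entry map such as {"set": v} or {"inc": n}
     */
    static Map<String, Object> apply(Map<String, Object> storedFields,
                                     Map<String, Object> docValuesOnly,
                                     Map<String, Map<String, Object>> ops) {
        Map<String, Object> merged = new LinkedHashMap<>(storedFields);
        // (a) carry the non-stored docvalues fields forward into the new document
        merged.putAll(docValuesOnly);
        for (Map.Entry<String, Map<String, Object>> fieldOps : ops.entrySet()) {
            String field = fieldOps.getKey();
            for (Map.Entry<String, Object> op : fieldOps.getValue().entrySet()) {
                if ("set".equals(op.getKey())) {
                    merged.put(field, op.getValue());
                } else if ("inc".equals(op.getKey())) {
                    // (b) "inc" needs the old value, which may live only in docvalues.
                    // Assumption for this sketch: a missing field counts as 0.
                    long old = merged.containsKey(field)
                            ? ((Number) merged.get(field)).longValue() : 0L;
                    merged.put(field, old + ((Number) op.getValue()).longValue());
                } else {
                    throw new IllegalArgumentException("unsupported op: " + op.getKey());
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("id", "1");
        stored.put("title", "hello");
        Map<String, Object> dvOnly = new LinkedHashMap<>();
        dvOnly.put("popularity", 10L); // non-stored docvalues field
        Map<String, Map<String, Object>> ops = new LinkedHashMap<>();
        ops.put("popularity", Collections.<String, Object>singletonMap("inc", 5));
        System.out.println(apply(stored, dvOnly, ops));
    }
}
```

In this model, dropping the `putAll(docValuesOnly)` step reproduces the reported behaviour: the field is lost from the updated document and "inc" has no previous value to start from.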

      1. SOLR-8276.patch
        5 kB
        Ishan Chattopadhyaya
      2. SOLR-8276.patch
        5 kB
        Ishan Chattopadhyaya
      3. SOLR-8276.patch
        4 kB
        Ishan Chattopadhyaya
      4. SOLR-8276.patch
        6 kB
        Ishan Chattopadhyaya
      5. SOLR-8276.patch
        7 kB
        Ishan Chattopadhyaya

          Activity

          Ishan Chattopadhyaya added a comment - edited

          Adding a patch that populates the non-stored, non-multivalued docvalues fields during atomic updates.

          I couldn't test multivalued non-stored docvalues fields, since I got the following exception when running the field() function query on a multivalued field: can not use FieldCache on multivalued field: intdvMulti. Am I missing something obvious?

          I think handling non-multivalued fields alone is already an improvement, and the case of multivalued fields can be dealt with separately.

          Ishan Chattopadhyaya added a comment -

          Uploaded a better patch.

          Yonik Seeley added a comment -

          "I couldn't test for multivalued non-stored docvalues fields, since I got the following exception during the field() function query on a multivalued field: can not use FieldCache on multivalued field: intdvMulti. Am I missing something obvious?"

          Since function queries don't support multivalued fields, should we go through the docValues API instead?

          Ishan Chattopadhyaya added a comment - edited

          I was hoping not to redo all the low-level conversions like Float.intBitsToFloat((int)arr.get(doc)) (this example is from FloatFieldSource), and so I wanted to use function queries instead. However, since multivalued docValues aren't accessible that way, I have three choices:

          1. Handle single-valued fields using function queries and multivalued fields using the docValues API (this will also require these low-level conversions for non-long docValues). Or,
          2. Handle both single-valued and multivalued fields using the docValues API, doing the low-level conversions for both. Or,
          3. Handle single-valued fields using function queries, and extend function queries to support multivalued docValues and use them.

          Yonik Seeley, any preference? Right now I'm thinking of going with option 1 and, when/if function queries can be made to support multivalued fields, switching to option 3. Does that sound good? (I am fine with going the option 2 route as well.) Also, are there any performance implications I am overlooking when using value sources as opposed to the docValues API directly?
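
For readers unfamiliar with the "low-level conversions" mentioned in this comment, the following sketch shows the kind of decoding implied by the quoted Float.intBitsToFloat((int)arr.get(doc)) expression: a numeric docvalue is read back as a long, and a float or double has to be reconstructed from its raw bits. The class and method names are hypothetical; only the java.lang.Float and java.lang.Double calls are real APIs, and the assumption that the field stores plain IEEE 754 bits is taken from the quoted expression rather than verified against every field type.

```java
// Hypothetical helper, not part of Solr or Lucene. It assumes that a float value
// is kept as its raw 32-bit pattern widened to a long, and a double as its raw
// 64-bit pattern, which is what the expression quoted in the comment implies.
class DocValuesDecodeSketch {

    static long encodeFloat(float value) {
        return Float.floatToIntBits(value); // int bits, widened to long
    }

    static float decodeFloat(long raw) {
        return Float.intBitsToFloat((int) raw); // narrow back to the 32-bit pattern
    }

    static long encodeDouble(double value) {
        return Double.doubleToLongBits(value);
    }

    static double decodeDouble(long raw) {
        return Double.longBitsToDouble(raw);
    }

    public static void main(String[] args) {
        long raw = encodeFloat(3.5f);
        System.out.println(raw + " -> " + decodeFloat(raw));
    }
}
```

Going through function queries (value sources) hides this step, which is why option 1 only needs such conversions for the multivalued case.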

          Ishan Chattopadhyaya added a comment -

          However we do it here, maybe we should do it the same way for SOLR-8220?

          Yonik Seeley added a comment -

          I think doing it whatever way is easiest for now is fine... it's implementation, not interface.

          Ishan Chattopadhyaya added a comment -

          Moved the code from here over to SOLR-8220's patch. Now this is as simple as calling the right method in SolrIndexSearcher.

          Updated the patch here; it still contains the test.

          Ishan Chattopadhyaya added a comment -

          Adding a test for a single-valued non-stored docValues field, to ensure it is carried forward during atomic updates.

          Ishan Chattopadhyaya added a comment - edited

          Updating the patch with the latest changes introduced in SOLR-8220.
          TODO: Add a test for pure RTG for docs with non-stored DV fields.

          Shalin Shekhar Mangar, Yonik Seeley: can you please review this? Thanks.

          Shalin Shekhar Mangar added a comment -

          +1 LGTM

          ASF subversion and git services added a comment -

          Commit 1722009 from shalin@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1722009 ]

          SOLR-8276: Atomic updates and realtime-get do not work with non-stored docvalues

          ASF subversion and git services added a comment -

          Commit 1722011 from shalin@apache.org in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1722011 ]

          SOLR-8276: Atomic updates and realtime-get do not work with non-stored docvalues

          Shalin Shekhar Mangar added a comment -

          Thanks Yonik and Ishan!


            People

            • Assignee: Shalin Shekhar Mangar
            • Reporter: Ishan Chattopadhyaya
            • Votes: 0
            • Watchers: 6
