Solr
  1. Solr
  2. SOLR-6307

Atomic update remove does not work for int array or date array

    Details

      Description

      Try to remove an element in the string array with curl:

      curl http://localhost:8080/update\?commit\=true -H 'Content-type:application/json' -d '[{ "attr_birth_year_is": { "remove": [1960]},  "id": 1098}]'
      
      curl http://localhost:8080/update\?commit\=true -H 'Content-type:application/json' -d '[{"reserved_on_dates_dts": {"remove": ["2014-02-12T12:00:00Z", "2014-07-16T12:00:00Z", "2014-02-15T12:00:00Z", "2014-02-21T12:00:00Z"]}, "id": 1098}]'
      

      Neither of them works.

      The set and add operation for int array works.
      The set, remove, and add operation for string array works

      1. SOLR-6307.patch
        1 kB
        Anurag Sharma
      2. SOLR-6307.patch
        35 kB
        Noble Paul
      3. SOLR-6307.patch
        35 kB
        Anurag Sharma
      4. SOLR-6307.patch
        37 kB
        Anurag Sharma
      5. unitTests-6307.txt
        86 kB
        Anurag Sharma

        Activity

        Hide
        Kun Xi added a comment - - edited

        Related to SOLR-3862.

        Show
        Kun Xi added a comment - - edited Related to SOLR-3862 .
        Hide
        Anurag Sharma added a comment -

        Hi Kun Xi,

        Can you provide more positive and negative test cases covering actual and expected results on this issue.
        I am working on fixing the issue.

        Thanks
        Anurag

        Show
        Anurag Sharma added a comment - Hi Kun Xi, Can you provide more positive and negative test cases covering actual and expected results on this issue. I am working on fixing the issue. Thanks Anurag
        Hide
        Kun Xi added a comment - - edited

        Anurag Sharma

        Here is how I reproduce the bug:

        1. create a document schema with two fields

        • birth_year_is: multivalue int field
        • reservation_dts: multivalue datetime field

        2. create a document with dummy data:

        • birth_year_is: [ 1960, 1970 ]
        • reservation_dts: ["2014-02-12T12:00:00Z", "2014-07-16T12:00:00Z"]

        3. try to remove 1970 from birth_year_is:

        curl http://localhost:8080/update\?commit\=true -H 'Content-type:application/json' -d '[{ "birth_year_is": { "remove": [1970]},  "id": 1}]'
        

        4. try to remove 2014-07-16T12:00:00Z from reservation_dts:

        curl http://localhost:8080/update\?commit\=true -H 'Content-type:application/json' -d '[{ "reservation_dts": { "remove": ["2014-07-16T12:00:00Z"]},  "id": 1}]'
        

        5. go to solr console and verify the two fields are NOT updated.

        Show
        Kun Xi added a comment - - edited Anurag Sharma Here is how I reproduce the bug: 1. create a document schema with two fields birth_year_is: multivalue int field reservation_dts: multivalue datetime field 2. create a document with dummy data: birth_year_is: [ 1960, 1970 ] reservation_dts: ["2014-02-12T12:00:00Z", "2014-07-16T12:00:00Z"] 3. try to remove 1970 from birth_year_is: curl http: //localhost:8080/update\?commit\= true -H 'Content-type:application/json' -d '[{ "birth_year_is" : { "remove" : [1970]}, "id" : 1}]' 4. try to remove 2014-07-16T12:00:00Z from reservation_dts: curl http: //localhost:8080/update\?commit\= true -H 'Content-type:application/json' -d '[{ "reservation_dts" : { "remove" : [ "2014-07-16T12:00:00Z" ]}, "id" : 1}]' 5. go to solr console and verify the two fields are NOT updated.
        Hide
        Anurag Sharma added a comment -

        Kun,
        Thanks a lot for the update.

        The atomic removal is not working due to the difference in types of document schema i.e. for birth_year_is has multivalue Integer field but removal is called with Long for value [1970].

        I am working on creating a patch

        Show
        Anurag Sharma added a comment - Kun, Thanks a lot for the update. The atomic removal is not working due to the difference in types of document schema i.e. for birth_year_is has multivalue Integer field but removal is called with Long for value [1970] . I am working on creating a patch
        Hide
        Anurag Sharma added a comment -

        There are three approaches to fix this issue:
        1. Enhance the org.noggit.JSONParser to support Integer and Date types.
        2. Create the input collection type based on document id and it's lucene schema.
        3. During remove typecast the input collection to original based on key for Integer and Date.

        Please suggest which approach can be taken. To me approach #2 looks generic.

        Show
        Anurag Sharma added a comment - There are three approaches to fix this issue: 1. Enhance the org.noggit.JSONParser to support Integer and Date types. 2. Create the input collection type based on document id and it's lucene schema. 3. During remove typecast the input collection to original based on key for Integer and Date. Please suggest which approach can be taken. To me approach #2 looks generic.
        Hide
        Noble Paul added a comment -

        I didn't quite understand #3

        #2 looks reasonable to me

        Show
        Noble Paul added a comment - I didn't quite understand #3 #2 looks reasonable to me
        Hide
        Anurag Sharma added a comment - - edited

        Here is the code snippet of remove function. fieldValue is an Object containing items to be removed.

        final Collection<Object> original = existingField.getValues();
        if (fieldVal instanceof Collection) {
        original.removeAll((Collection) fieldVal);
        } else {
        original.remove(fieldVal);
        }

        The type of fieldVal is parsed by org.noggit.JSONParser.

        To describe more on #3, "original.removeAll((Collection) fieldVal);" is only successful when collection items of original and fieldVal are of the same type. So before making above call another collection can be be build at runtime containing same item type as the item types of original by typecasting the value(s) from fieldVal and pass them like: original.removeAll(typeCastedCollectionOfInputItems);
        This approach has minimal code change and will impact only doRemove() function in DistributedUpdateProcessor class.

        Show
        Anurag Sharma added a comment - - edited Here is the code snippet of remove function. fieldValue is an Object containing items to be removed. final Collection<Object> original = existingField.getValues(); if (fieldVal instanceof Collection) { original.removeAll((Collection) fieldVal); } else { original.remove(fieldVal); } The type of fieldVal is parsed by org.noggit.JSONParser. To describe more on #3, "original.removeAll((Collection) fieldVal);" is only successful when collection items of original and fieldVal are of the same type. So before making above call another collection can be be build at runtime containing same item type as the item types of original by typecasting the value(s) from fieldVal and pass them like: original.removeAll(typeCastedCollectionOfInputItems); This approach has minimal code change and will impact only doRemove() function in DistributedUpdateProcessor class.
        Hide
        Noble Paul added a comment -

        I still feel #2 is the right way to do it

        Show
        Noble Paul added a comment - I still feel #2 is the right way to do it
        Hide
        Anurag Sharma added a comment -

        Here is the patch for review using approach#2.

        Other than Int and Date also have covered the case for float.

        Show
        Anurag Sharma added a comment - Here is the patch for review using approach#2. Other than Int and Date also have covered the case for float.
        Hide
        Noble Paul added a comment -

        have you given the final patch ? I can review and commit this

        Show
        Noble Paul added a comment - have you given the final patch ? I can review and commit this
        Hide
        Anurag Sharma added a comment -

        Yes. Sure, go ahead!
        Should I send the review using RBT or the attached patch file will work?

        Show
        Anurag Sharma added a comment - Yes. Sure, go ahead! Should I send the review using RBT or the attached patch file will work?
        Hide
        Noble Paul added a comment -

        does the patch apply on trunk. It does not seem to be

        Show
        Noble Paul added a comment - does the patch apply on trunk. It does not seem to be
        Hide
        Anurag Sharma added a comment -

        My local copy was quite old, updated it with the HEAD revision in trunk. Attaching again the patch after update.

        Show
        Anurag Sharma added a comment - My local copy was quite old, updated it with the HEAD revision in trunk. Attaching again the patch after update.
        Hide
        Anurag Sharma added a comment -

        I've added few more comments in the code and cleaned up commented code and non-modified file from the patch.
        Clean build and unit test run with latest updated code is still pending. I'll update another patch in case find any issue(s).

        Show
        Anurag Sharma added a comment - I've added few more comments in the code and cleaned up commented code and non-modified file from the patch. Clean build and unit test run with latest updated code is still pending. I'll update another patch in case find any issue(s).
        Hide
        Anurag Sharma added a comment -

        Am able to successfully execute the existing and newly added unit tests in the AtomicUpdatesTest. Log attached.

        Show
        Anurag Sharma added a comment - Am able to successfully execute the existing and newly added unit tests in the AtomicUpdatesTest. Log attached.
        Hide
        Anurag Sharma added a comment -

        Need more clarification regarding applicability of the patch on trunk

        I see following differences in the revision:
        during patch upload — core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java (revision 1631811)
        now — core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java (revision 1631826)

        My doubt is the revision will keep bumping and current revision and patch will never be in sync. So what's the correct way to apply the patch on trunk. I am new to the process so trying to understand.

        Show
        Anurag Sharma added a comment - Need more clarification regarding applicability of the patch on trunk I see following differences in the revision: during patch upload — core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java (revision 1631811) now — core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java (revision 1631826) My doubt is the revision will keep bumping and current revision and patch will never be in sync. So what's the correct way to apply the patch on trunk. I am new to the process so trying to understand.
        Hide
        Noble Paul added a comment -

        The patch would apply even if the files are modified. If there is a conflict it does not apply

        Show
        Noble Paul added a comment - The patch would apply even if the files are modified. If there is a conflict it does not apply
        Hide
        Noble Paul added a comment -

        patch refactored . Please review and let me know if I missed something.

        I shall commit this soon

        Show
        Noble Paul added a comment - patch refactored . Please review and let me know if I missed something. I shall commit this soon
        Hide
        Anurag Sharma added a comment -

        Thanks a lot for reviewing and refactoring it.

        Introducing the toNativeType() at FieldType is good. Only one thing I see is missed in class TrieIntField when overriding toNativeType. It is not parsing Double before Float TrieIntField. The intention of doing it was to find if Double can be extracted before Float. Rest is good.

        Snippet from earlier patch
        + // when Double value passed as a String
        + if(!removed && nonIntegerFormat)
        + removed = original.remove((new Double(Double.parseDouble(object.toString()))).intValue());
        + // when Float value passed as a String
        + if(!removed && nonIntegerFormat)
        + removed = original.remove(new Float(Float.parseFloat(object.toString())).intValue());

        Show
        Anurag Sharma added a comment - Thanks a lot for reviewing and refactoring it. Introducing the toNativeType() at FieldType is good. Only one thing I see is missed in class TrieIntField when overriding toNativeType. It is not parsing Double before Float TrieIntField. The intention of doing it was to find if Double can be extracted before Float. Rest is good. Snippet from earlier patch + // when Double value passed as a String + if(!removed && nonIntegerFormat) + removed = original.remove((new Double(Double.parseDouble(object.toString()))).intValue()); + // when Float value passed as a String + if(!removed && nonIntegerFormat) + removed = original.remove(new Float(Float.parseFloat(object.toString())).intValue());
        Hide
        Noble Paul added a comment -

        it's not possible to do both parsing . Does it make any difference?

        Show
        Noble Paul added a comment - it's not possible to do both parsing . Does it make any difference?
        Hide
        ASF subversion and git services added a comment -

        Commit 1632259 from Noble Paul in branch 'dev/trunk'
        [ https://svn.apache.org/r1632259 ]

        SOLR-6307

        Show
        ASF subversion and git services added a comment - Commit 1632259 from Noble Paul in branch 'dev/trunk' [ https://svn.apache.org/r1632259 ] SOLR-6307
        Hide
        ASF subversion and git services added a comment -

        Commit 1632260 from Noble Paul in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1632260 ]

        SOLR-6307

        Show
        ASF subversion and git services added a comment - Commit 1632260 from Noble Paul in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1632260 ] SOLR-6307
        Hide
        Anurag Sharma added a comment -

        Sorry I missed the last comment. With float I saw sometime it was truncating the value like 1111.666 to 1112 and look for removing 1112 so introduced parsing double before float. Both parsing can be done by introducing nested try and catch.

        Independent of above another issue I saw in remove api is it is not giving any message if the element/collection deleted successfully or failed. Probably this will help the caller to know the current state otherwise the caller have to make another call to real time /get call on the id to know the state. Should we raise another issue for this?

        Show
        Anurag Sharma added a comment - Sorry I missed the last comment. With float I saw sometime it was truncating the value like 1111.666 to 1112 and look for removing 1112 so introduced parsing double before float. Both parsing can be done by introducing nested try and catch. Independent of above another issue I saw in remove api is it is not giving any message if the element/collection deleted successfully or failed. Probably this will help the caller to know the current state otherwise the caller have to make another call to real time /get call on the id to know the state. Should we raise another issue for this?
        Hide
        Noble Paul added a comment -

        Please submit another patch here itself

        Show
        Noble Paul added a comment - Please submit another patch here itself
        Hide
        Anurag Sharma added a comment -

        Here is the patch parsing the Double before Float

        Show
        Anurag Sharma added a comment - Here is the patch parsing the Double before Float
        Hide
        Anshum Gupta added a comment -

        Bulk close after 5.0 release.

        Show
        Anshum Gupta added a comment - Bulk close after 5.0 release.

          People

          • Assignee:
            Noble Paul
            Reporter:
            Kun Xi
          • Votes:
            1 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development