Solr / SOLR-15411

"failOnVersionConflicts" doesn't get checked in getUpdatedDocument when _version_==1

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 8.5.1
    • None
    • None

    Description

      I'm using SolrJ to send atomic updates for 4 docs:

       

      UpdateRequest updateRequest = new UpdateRequest();
      updateRequest.setAction(UpdateRequest.ACTION.COMMIT, false, false);

      for (Map<String, Object> map : maps) {
          if (map.containsKey("record_uuid")) {
              SolrInputDocument doc = new SolrInputDocument();
              doc.addField("assertionUserId", new HashMap<String, Object>() {{ put("set", map.getOrDefault("assertionUserId", null)); }});
              // other doc.addField()
              logger.debug("Added solr doc for record: " + doc.get("id"));
              updateRequest.add(doc);
          }
      }

      // update only when there are docs to update
      if (updateRequest.getDocuments() != null) {
          updateRequest.setParam("_version_", "1");
          updateRequest.setParam("failOnVersionConflicts", "false");
          updateRequest.process(solrClient);
      }

      Four docs are added to the updateRequest, and the 2nd doc has an invalid id.

      Note that I've set _version_ to 1 and failOnVersionConflicts to false.

      When I ran the program I got this log:

      - Added solr doc for record: id=2c9b8671-a55a-40ab-940f-06f9cf987880
      - Added solr doc for record: id=c0ee1a86-1df6-40b2-950c-bdde40b1c46e_invalid
      - Added solr doc for record: id=af56ce03-e664-421a-85ac-fbb839bbb140
      - Added solr doc for record: id=6bdc3c1d-d21a-43e3-aa84-baeeda601bb3
      
      ERROR: [ConcurrentUpdateSolrClient] - error
      org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/biocache: Conflict

      request: http://localhost:8983/solr/biocache/update?commit=true&softCommit=false&waitSearcher=false&_version_=1&failOnVersionConflicts=false&wt=javabin&version=2

      Remote error message: Document not found for update.  id=c0ee1a86-1df6-40b2-950c-bdde40b1c46e_invalid
          at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream

       

      The error is "Document not found for update", which is expected for the invalid id. Of the 4 updates, only the 2nd has an invalid id, so I expected the 2nd update to fail and the other 3 to succeed (since I set failOnVersionConflicts=false).

      In reality, only the 1st update succeeded. All the others failed because an exception was thrown for the 2nd update.

      I then checked the 8.5.1 source code. The exception is thrown from getUpdatedDocument:

       

      SolrInputDocument oldRootDocWithChildren = RealTimeGetComponent.getInputDocument(cmd.getReq().getCore(), idBytes, RealTimeGetComponent.Resolution.ROOT_WITH_CHILDREN);

      if (oldRootDocWithChildren == null) {
          if (versionOnUpdate > 0) {
              // could just let the optimistic locking throw the error
              throw new SolrException(ErrorCode.CONFLICT, "Document not found for update. id=" + idString);
          } else if (req.getParams().get(ShardParams._ROUTE_) != null) {

      It looks up the doc with the specified id (the lookup returns null in this case) and then, because versionOnUpdate == 1, throws unconditionally. So where is failOnVersionConflicts used?

      It's in doVersionAdd(). That function calls getUpdatedDocument first (line 374) and only afterwards compares versionOnUpdate with foundVersion and checks the failOnVersionConflicts flag. In my case the flag is never checked, because the exception is thrown before that point.

       

      getUpdatedDocument(cmd, versionOnUpdate);

      // leaders can also be in buffering state during "migrate" API call, see SOLR-5308
      if (forwardedFromCollection && ulog.getState() != UpdateLog.State.ACTIVE
          && isReplayOrPeersync == false) {
          // ....
          return true;
      }

      if (versionOnUpdate != 0) {
          Long lastVersion = vinfo.lookupVersion(cmd.getIndexedId());
          long foundVersion = lastVersion == null ? -1 : lastVersion;
          if (versionOnUpdate == foundVersion || (versionOnUpdate < 0 && foundVersion < 0)
              || (versionOnUpdate == 1 && foundVersion > 0)) {
              // we're ok if versions match, or if both are negative (all missing docs are equal), or if cmd
              // specified it must exist (versionOnUpdate==1) and it does.
          } else {
              if(cmd.getReq().getParams().getBool(CommonParams.FAIL_ON_VERSION_CONFLICTS, true) == false) {
                  return true;
              }

      This looks like a bug: on this path the failOnVersionConflicts flag is never consulted.
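      To make the ordering problem concrete, here is a minimal, self-contained model of the two checks quoted above. It is a sketch, not Solr code: the class and method names (`VersionFlowSketch`, `getUpdatedDocumentModel`, `doVersionAddModel`) are hypothetical, the index is reduced to a map from id to version, and only the atomic-update path is modelled.

```java
import java.util.HashMap;
import java.util.Map;

class VersionFlowSketch {
    // Stand-in for the index: id -> current version (hypothetical, not Solr's data structure).
    static final Map<String, Long> index = new HashMap<>();

    // Models the early check quoted from getUpdatedDocument: a missing doc combined
    // with versionOnUpdate > 0 throws without looking at failOnVersionConflicts.
    static void getUpdatedDocumentModel(String id, long versionOnUpdate) {
        if (!index.containsKey(id) && versionOnUpdate > 0) {
            throw new IllegalStateException("Document not found for update. id=" + id);
        }
    }

    // Models the later check quoted from doVersionAdd. Returns true when the add
    // is dropped silently, false when it would go ahead.
    static boolean doVersionAddModel(String id, long versionOnUpdate, boolean failOnVersionConflicts) {
        getUpdatedDocumentModel(id, versionOnUpdate); // may throw before the flag is read

        if (versionOnUpdate != 0) {
            Long lastVersion = index.get(id);
            long foundVersion = lastVersion == null ? -1 : lastVersion;
            boolean ok = versionOnUpdate == foundVersion
                    || (versionOnUpdate < 0 && foundVersion < 0)
                    || (versionOnUpdate == 1 && foundVersion > 0);
            if (!ok) {
                if (!failOnVersionConflicts) {
                    return true; // conflict ignored, document skipped
                }
                throw new IllegalStateException("version conflict for id=" + id);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        index.put("aaa", 5L);
        // _version_=-1 on an existing doc: the conflict is swallowed because the flag is false.
        System.out.println(doVersionAddModel("aaa", -1, false));
        // _version_=1 on a missing doc: the early check throws and the flag is never read.
        try {
            doVersionAddModel("missing", 1, false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      In this model the `_version_=-1` conflict reaches the flag check and is skipped quietly, while the `_version_=1` case on a missing document never gets that far, which matches the behaviour described in this report.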

       
      Update:

      According to the Solr documentation, "Updating Parts of Documents" (Apache Solr Reference Guide 8.5), https://solr.apache.org/guide/8_5/updating-parts-of-documents.html:
       

      $ curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/techproducts/update?versions=true&_version_=-1&failOnVersionConflicts=false&omitHeader=true' --data-binary ' [ { "id" : "aaa" }, { "id" : "ccc" } ]'
      In this example, we have added 2 documents "aaa" and "ccc". As we have specified the parameter _version_=-1, this request should not add the document with the id aaa because it already exists. The request succeeds & does not throw any error because the failOnVersionConflicts=false parameter is specified

      If Solr can honor _version_=-1 together with failOnVersionConflicts=false and add only the documents that do not exist yet, there's no reason it can't do the same for _version_=1: update the documents that exist and skip the ones that don't.
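      The behaviour requested here can be sketched as follows. This is only an illustration of the expected semantics for `_version_=1`, not the attached patch and not Solr code: `LenientMustExistSketch` and `applyMustExistBatch` are hypothetical names, and the index is again a plain map from id to version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LenientMustExistSketch {
    // Applies a batch as if every document carried _version_=1 ("must exist").
    // Missing ids are skipped when failOnVersionConflicts is false and rejected
    // with an exception otherwise. Returns the ids that were updated.
    static List<String> applyMustExistBatch(Map<String, Long> index, List<String> ids,
                                            boolean failOnVersionConflicts) {
        List<String> updated = new ArrayList<>();
        for (String id : ids) {
            if (!index.containsKey(id)) {
                if (!failOnVersionConflicts) {
                    continue; // treat "must exist but doesn't" like any other ignored conflict
                }
                throw new IllegalStateException("Document not found for update. id=" + id);
            }
            index.merge(id, 1L, Long::sum); // pretend the update bumps the version
            updated.add(id);
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Long> index = new HashMap<>();
        index.put("2c9b8671-a55a-40ab-940f-06f9cf987880", 10L);
        index.put("af56ce03-e664-421a-85ac-fbb839bbb140", 20L);
        index.put("6bdc3c1d-d21a-43e3-aa84-baeeda601bb3", 30L);

        List<String> ids = Arrays.asList(
                "2c9b8671-a55a-40ab-940f-06f9cf987880",
                "c0ee1a86-1df6-40b2-950c-bdde40b1c46e_invalid",
                "af56ce03-e664-421a-85ac-fbb839bbb140",
                "6bdc3c1d-d21a-43e3-aa84-baeeda601bb3");

        // With the flag set to false, the invalid id is skipped and the other three are updated.
        System.out.println(applyMustExistBatch(index, ids, false));
    }
}
```

      With the four ids from the log above, the model updates the three existing documents and skips the invalid one, which is the outcome this report expected from the real request.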

            People

              Assignee: Unassigned
              Reporter: xuanyu huang (hxuanyu)
                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m