CouchDB
  1. CouchDB
  2. COUCHDB-911

Same ocurrence of a document in _bulk_docs generates erroneous conflict messages

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0
    • Fix Version/s: 1.2, 1.3
    • Component/s: HTTP Interface
    • Labels:
      None
    • Environment:

      Cloudant BigCouch EC2 node

    • Skill Level:
      Committers Level (Medium to Hard)

      Description

      Repeating an "_id" in a _bulk_docs post data file results in both entries being reported as document conflict errors. The first occurrence actual inserts into the database, and only the second occurrence should report a conflict.

      curl -d '{ "docs": [

      {"_id":"foo"}

      ,

      {"_id","foo"}

      ] }' -H 'Content-Type:application/json' -X POST http://appadvice.cloudant.com/foo/_bulk_docs

      [

      {"id":"foo","error":"conflict","reason":"Document update conflict."}

      ,

      {"id":"foo","error":"conflict","reason":"Document update conflict."}

      ]

      But the database shows that one new document was actually inserted.

      Only the second occurrence should report conflict. The first occurrence should report the "_rev" property of the newly inserted doc.

        Activity

        Hide
        Klaus Trainer added a comment -

        Trying your example request with CouchDB 1.0.1, I get the following error message:
        {"error":"bad_request","reason":"invalid UTF-8 JSON: <<\"{\\\"docs\\\": [

        {\\\"_id\\\":\\\"foo\\\"}

        ,

        {\\\"_id\\\",\\\"foo\\\"}

        ]}\">>"}

        When replacing the comma with a colon, I get this result:

        {"ok":true,"id":"87bea69a9e2eb988d467183b6100084a","rev":"1-ad4b328fcf36f99be059e435c1c32c1c"}

        Obviously, that's not a CouchDB issue.

        Why not just report Cloudant/BigCouch issues that are no CouchDB issues to Cloudant or rather to the BigCouch project.

        • Klaus
        Show
        Klaus Trainer added a comment - Trying your example request with CouchDB 1.0.1, I get the following error message: {"error":"bad_request","reason":"invalid UTF-8 JSON: <<\"{\\\"docs\\\": [ {\\\"_id\\\":\\\"foo\\\"} , {\\\"_id\\\",\\\"foo\\\"} ]}\">>"} When replacing the comma with a colon, I get this result: {"ok":true,"id":"87bea69a9e2eb988d467183b6100084a","rev":"1-ad4b328fcf36f99be059e435c1c32c1c"} Obviously, that's not a CouchDB issue. Why not just report Cloudant/BigCouch issues that are no CouchDB issues to Cloudant or rather to the BigCouch project. Klaus
        Hide
        Klaus Trainer added a comment -

        Sorry, I need to currect myself. I was too stupid to correctly specify the _bulk_docs URL.

        I can confirm that the described behavior also applies to CouchDB 1.0.1.

        curl http://127.0.0.1:5984/db/_bulk_docs -d '{"docs": [

        {"_id":"foo"}

        ,

        {"_id":"foo"}

        ]}' -H 'Content-Type: application/json'
        [

        {"id":"foo","error":"conflict","reason":"Document update conflict."}

        ,

        {"id":"foo","error":"conflict","reason":"Document update conflict."}

        ]

        However, one of the updates actually succeeded.

        Note that when "all_or_nothing": true is specified, the result is different.

        curl http://127.0.0.1:5984/db/_bulk_docs -d '{"all_or_nothing": true, "docs": [

        {"_id":"foo"}

        ,

        {"_id":"foo"}

        ]}' -H 'Content-Type: application/json'
        [

        {"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"}

        ,

        {"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"}

        ]

        Show
        Klaus Trainer added a comment - Sorry, I need to currect myself. I was too stupid to correctly specify the _bulk_docs URL. I can confirm that the described behavior also applies to CouchDB 1.0.1. curl http://127.0.0.1:5984/db/_bulk_docs -d '{"docs": [ {"_id":"foo"} , {"_id":"foo"} ]}' -H 'Content-Type: application/json' [ {"id":"foo","error":"conflict","reason":"Document update conflict."} , {"id":"foo","error":"conflict","reason":"Document update conflict."} ] However, one of the updates actually succeeded. Note that when "all_or_nothing": true is specified, the result is different. curl http://127.0.0.1:5984/db/_bulk_docs -d '{"all_or_nothing": true, "docs": [ {"_id":"foo"} , {"_id":"foo"} ]}' -H 'Content-Type: application/json' [ {"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"} , {"id":"foo","rev":"1-967a00dff5e02add41819138abb3284d"} ]
        Hide
        Paul Joseph Davis added a comment -

        You've sent two write commands to the database using a single _rev token. This means you've got a document that's entered into a conflict. Having the first occurrence in the results not be in conflict would be misleading because it is in conflict.

        Theoretically we could add a check to reject _bulk_docs updates with duplicate _id, _rev pairs, but that doesn't seem like a good idea for some reason.

        Show
        Paul Joseph Davis added a comment - You've sent two write commands to the database using a single _rev token. This means you've got a document that's entered into a conflict. Having the first occurrence in the results not be in conflict would be misleading because it is in conflict. Theoretically we could add a check to reject _bulk_docs updates with duplicate _id, _rev pairs, but that doesn't seem like a good idea for some reason.
        Hide
        Jan Lehnardt added a comment -

        +1 Paul.

        Show
        Jan Lehnardt added a comment - +1 Paul.
        Hide
        Adam Kocoloski added a comment -

        I think the issue here is that we responde with two conflict errors, but we do end up saving the document in the database. If I get a 409 Conflict response from CouchDB I assume the database rejected the request, but it's not the case here.

        Show
        Adam Kocoloski added a comment - I think the issue here is that we responde with two conflict errors, but we do end up saving the document in the database. If I get a 409 Conflict response from CouchDB I assume the database rejected the request, but it's not the case here.
        Hide
        Bob Dionne added a comment -

        so actually when there are two docs of the same id in _bulk_docs, it's the second that persists and the first that generates a conflict in merge_rev_trees, because group_alike_docs[1] reverses the order of the docs in a bucket. To preserve the order we need another reverse here. If we fixed this, we could hack the ResultsDict[2], as we're processing the docs, and return only the second doc as a conflict. But this would only solve the bulk_docs case.

        There's also the issue of couch_db_updater:collect_updates [3] that appears to allow more grouped docs to be added for other clients? This is ugly, perhaps it needs to be disabled for the bulk_docs case. The alternative would be to add some bits to the handle_info({update_docs.... call to track which docs turn out to have conflicts.

        It's not pretty

        [1] https://github.com/bdionne/couchdb/blob/master/src/couchdb/couch_db.erl#L435
        [2] https://github.com/bdionne/couchdb/blob/master/src/couchdb/couch_db.erl#L759
        [3] https://github.com/bdionne/couchdb/blob/master/src/couchdb/couch_db_updater.erl#L274

        Show
        Bob Dionne added a comment - so actually when there are two docs of the same id in _bulk_docs, it's the second that persists and the first that generates a conflict in merge_rev_trees, because group_alike_docs [1] reverses the order of the docs in a bucket. To preserve the order we need another reverse here. If we fixed this, we could hack the ResultsDict [2] , as we're processing the docs, and return only the second doc as a conflict. But this would only solve the bulk_docs case. There's also the issue of couch_db_updater:collect_updates [3] that appears to allow more grouped docs to be added for other clients? This is ugly, perhaps it needs to be disabled for the bulk_docs case. The alternative would be to add some bits to the handle_info({update_docs.... call to track which docs turn out to have conflicts. It's not pretty [1] https://github.com/bdionne/couchdb/blob/master/src/couchdb/couch_db.erl#L435 [2] https://github.com/bdionne/couchdb/blob/master/src/couchdb/couch_db.erl#L759 [3] https://github.com/bdionne/couchdb/blob/master/src/couchdb/couch_db_updater.erl#L274
        Hide
        Paul Joseph Davis added a comment -

        Awesome work tracking this down. I don't have much to add just yet, but I was going to ask if we really care that it's the first doc that's accepted and not the second? As long as the correct one is marked as a conflict does it matter what the order was in the _bulk_docs request?

        Show
        Paul Joseph Davis added a comment - Awesome work tracking this down. I don't have much to add just yet, but I was going to ask if we really care that it's the first doc that's accepted and not the second? As long as the correct one is marked as a conflict does it matter what the order was in the _bulk_docs request?
        Hide
        Bob Dionne added a comment -

        No it really doesn't matter. One could argue that since the list of buckets is reversed then the buckets should also be reversed to preserve the order, or neither should.

        So here's a solution[1] that works that does depend on update_docs being called with the docs in the correct order. It needs a bit of refactoring but the idea itself is simple. It removes the dictionary since dict:from_list throws away the dups, and works thru the list of docs backwards, looking them up in the results list from the commits. If conflicts are present they are matched to the correct docs. As they are used they are removed from the commits list.

        It's not pretty but I should be able to simplify, the only alternative I could think of involves passing some additional bits in to the elephant grave yard that is couch_db_updater.

        [1] https://github.com/cloudant/bigcouch/commit/3fbb4fd1caa0d998

        Show
        Bob Dionne added a comment - No it really doesn't matter. One could argue that since the list of buckets is reversed then the buckets should also be reversed to preserve the order, or neither should. So here's a solution [1] that works that does depend on update_docs being called with the docs in the correct order. It needs a bit of refactoring but the idea itself is simple. It removes the dictionary since dict:from_list throws away the dups, and works thru the list of docs backwards, looking them up in the results list from the commits. If conflicts are present they are matched to the correct docs. As they are used they are removed from the commits list. It's not pretty but I should be able to simplify, the only alternative I could think of involves passing some additional bits in to the elephant grave yard that is couch_db_updater. [1] https://github.com/cloudant/bigcouch/commit/3fbb4fd1caa0d998
        Hide
        Jason Smith added a comment -

        In another discussion, Adam said there is no global ordering of events.

        http://mail-archives.apache.org/mod_mbox/couchdb-dev/201108.mbox/%3c54390D26-2473-4B42-A2F2-EBBC7ED91D5A@apache.org%3e

        A bulk update can contain a ddoc which would reject documents, they still save, because the commit order is undefined.

        Thus instead of "it doesn't matter" which doc is saved and which are rejected, I believe the final answer is "it is not defined" which is saved.

        Show
        Jason Smith added a comment - In another discussion, Adam said there is no global ordering of events. http://mail-archives.apache.org/mod_mbox/couchdb-dev/201108.mbox/%3c54390D26-2473-4B42-A2F2-EBBC7ED91D5A@apache.org%3e A bulk update can contain a ddoc which would reject documents, they still save, because the commit order is undefined. Thus instead of "it doesn't matter" which doc is saved and which are rejected, I believe the final answer is "it is not defined" which is saved.
        Hide
        Bob Dionne added a comment -

        I think that's true, you might have a dup in the bulk docs request and both are rejected because a third with the same id was sent from another client. All this patch does is match the docs that are sent to couch_db_updater with any conflict messages that are returned. If you send docs A,B,C,D, and E they will be processed in this order. I think this is true, even though other clients might send updates for B and C and they will be grouped in. But the client that sent A,B,C,D, and E will see any conflicts returned in that order.

        Two docs with the same id might have different bodies, .ie. aren't really dups. It's important to indicate which is rejected and which committed. Currently for both this case and the case of dups we save the second (because we reverse the order) and report both rejected, which leaves the database in an inconsistent state.

        Show
        Bob Dionne added a comment - I think that's true, you might have a dup in the bulk docs request and both are rejected because a third with the same id was sent from another client. All this patch does is match the docs that are sent to couch_db_updater with any conflict messages that are returned. If you send docs A,B,C,D, and E they will be processed in this order. I think this is true, even though other clients might send updates for B and C and they will be grouped in. But the client that sent A,B,C,D, and E will see any conflicts returned in that order. Two docs with the same id might have different bodies, .ie. aren't really dups. It's important to indicate which is rejected and which committed. Currently for both this case and the case of dups we save the second (because we reverse the order) and report both rejected, which leaves the database in an inconsistent state.
        Hide
        Bob Dionne added a comment -

        Here's a JS test and a draft of a patch.

        Show
        Bob Dionne added a comment - Here's a JS test and a draft of a patch.
        Hide
        Bob Dionne added a comment -

        Here's a somewhat cleaner version

        Show
        Bob Dionne added a comment - Here's a somewhat cleaner version
        Hide
        Bob Dionne added a comment -

        Here's a patch that adds a reference to each doc in order to distinguish the dups. It's a much more substantial change but arguably safer. All the tests pass.

        Show
        Bob Dionne added a comment - Here's a patch that adds a reference to each doc in order to distinguish the dups. It's a much more substantial change but arguably safer. All the tests pass.
        Hide
        Bob Dionne added a comment -

        updated patch to current trunk

        Show
        Bob Dionne added a comment - updated patch to current trunk
        Hide
        Adam Kocoloski added a comment -

        This is a bug in group commits. _bulk_docs is one way to generate a group commit, but it's not the only one. Two concurrent writers trying to update the same document will also trigger this bug. They'll both receive a 409 Conflict response, but one of the edits will be saved.

        We can't sweep this one under the rug by talking about proper usage of _bulk_docs.

        Show
        Adam Kocoloski added a comment - This is a bug in group commits. _bulk_docs is one way to generate a group commit, but it's not the only one. Two concurrent writers trying to update the same document will also trigger this bug. They'll both receive a 409 Conflict response, but one of the edits will be saved. We can't sweep this one under the rug by talking about proper usage of _bulk_docs.
        Hide
        Filipe Manana added a comment -

        Hi Bob, the patch looks correct to me, so +1.

        While I understand the issue with bulk docs (because of the way we collect the results in the dictionary), looking briefly at the code, I don't see yet how the same issue can happen when we have multiple clients updating the same doc.

        I made a test to confirm this, which is attached here and spawns many clients which attempt to update the same doc.

        Running this test, with the following one liner on top of it:

        http://friendpaste.com/I466uiIaQ8yzk0LUKPWAf

        I get the following sample output:

        http://friendpaste.com/3aMN5N8IWRuvw62hTbrZzB

        So effectively it seems to me it covers that case. For 5000 clients for e.g., the updater was able to collect 80 updates for the same document.

        Anyway, this is a good test to have around.

        Minor detail: you're using some Emacs mode indentation which is very different from what we use everywhere else.

        + Docs2 = lists:map(fun(Doc) ->
        +

        {Doc, make_ref()}

        + end,Docs),

        Show
        Filipe Manana added a comment - Hi Bob, the patch looks correct to me, so +1. While I understand the issue with bulk docs (because of the way we collect the results in the dictionary), looking briefly at the code, I don't see yet how the same issue can happen when we have multiple clients updating the same doc. I made a test to confirm this, which is attached here and spawns many clients which attempt to update the same doc. Running this test, with the following one liner on top of it: http://friendpaste.com/I466uiIaQ8yzk0LUKPWAf I get the following sample output: http://friendpaste.com/3aMN5N8IWRuvw62hTbrZzB So effectively it seems to me it covers that case. For 5000 clients for e.g., the updater was able to collect 80 updates for the same document. Anyway, this is a good test to have around. Minor detail: you're using some Emacs mode indentation which is very different from what we use everywhere else. + Docs2 = lists:map(fun(Doc) -> + {Doc, make_ref()} + end,Docs),
        Hide
        Adam Kocoloski added a comment -

        Well, I stand corrected don't I? Great test case Filipe – I don't know why I thought a dict was being used in the db_updater code path.

        Show
        Adam Kocoloski added a comment - Well, I stand corrected don't I? Great test case Filipe – I don't know why I thought a dict was being used in the db_updater code path.
        Hide
        Filipe Manana added a comment -

        Thanks Adam. My understanding from your previous comment was that if 2 clients try to update the same doc and the updater collects both updates, it would send a conflict error to both clients (while what happened was that one doc got persisted and the other didn't).

        Show
        Filipe Manana added a comment - Thanks Adam. My understanding from your previous comment was that if 2 clients try to update the same doc and the updater collects both updates, it would send a conflict error to both clients (while what happened was that one doc got persisted and the other didn't).
        Hide
        Adam Kocoloski added a comment -

        Yes, that was my (incorrect) assertion.

        Show
        Adam Kocoloski added a comment - Yes, that was my (incorrect) assertion.
        Hide
        Bob Dionne added a comment -

        Thanks Filipe, I ran the test fine and modified the patch to conform to the couchdb style

        Show
        Bob Dionne added a comment - Thanks Filipe, I ran the test fine and modified the patch to conform to the couchdb style
        Hide
        Jan Lehnardt added a comment -

        You guys rule, just sayin'!

        Show
        Jan Lehnardt added a comment - You guys rule, just sayin'!
        Hide
        Bob Dionne added a comment -

        Add references to docs to prevent dups from being collapsed

        Bulk docs requests may have duplicates or multiple docs with the same id.
        These were being collapsed in a dictionary as messages are passed from
        merge_rev_trees in couch_db_updater to collect_results in couch_db.
        Attaching a reference allows each to be processed correctly.

        Show
        Bob Dionne added a comment - Add references to docs to prevent dups from being collapsed Bulk docs requests may have duplicates or multiple docs with the same id. These were being collapsed in a dictionary as messages are passed from merge_rev_trees in couch_db_updater to collect_results in couch_db. Attaching a reference allows each to be processed correctly.
        Hide
        Filipe Manana added a comment -

        Updated the ticket's title because this really didn't affect concurrent updates.

        Show
        Filipe Manana added a comment - Updated the ticket's title because this really didn't affect concurrent updates.

          People

          • Assignee:
            Bob Dionne
            Reporter:
            Jay Nelson
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 48h
              48h
              Remaining:
              Remaining Estimate - 48h
              48h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development