CouchDB / COUCHDB-1163

Document returned by id, but cannot be found by rev

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.0.1, 1.0.2
    • Fix Version/s: 1.0.3, 1.1, 1.2
    • Component/s: Database Core
    • Labels:
      None
    • Skill Level:
      Committers Level (Medium to Hard)

      Description

      Somehow, our cluster has developed the following problem on a handful of documents. Will post reproduction steps if we find them. All properties have been redacted. All the documents this affects also have attachments, if that is significant. Once a document is in this situation, it causes conflict detection, replication and include_docs to behave incorrectly or outright fail.

      GET /database/4cdee83a118ea1cf3050b1d006144d46 returns

      { "_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "10-df4bf65a6104ea240f100c30d3cb245d", "foo": "bar" }

      GET /database/4cdee83a118ea1cf3050b1d006144d46?open_revs=all returns

      [
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "10-df4bf65a6104ea240f100c30d3cb245d", "foo": "bar"}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "8-eea5e36daee12acd79a127abf36f7720", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "9-2cead1e4c813a4f0d10a9bc4aa28bfda", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "7-c3b44f004660caa496804409089b53d9", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "6-52e978041bb324d19e01a2ac5a243702", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "5-761bf28c6989f0fde41bdd5732c33159", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "4-abb005cf4b2d2dd12880a33af1e7066e", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "3-233e4624e620ec1c8b66f21a051832f8", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "10-55f0cdf9dd95ed230b733a2c826c842c", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "11-264c9d6c249ba2fc9b13df35cb447fd7", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "9-2cead1e4c813a4f0d10a9bc4aa28bfda", "_deleted": true}},
        {"ok": {"_id": "4cdee83a118ea1cf3050b1d006144d46", "_rev": "2-9f2df19059d9a460a12740a63a4d95e9", "_deleted": true}}
      ]

      GET /database/4cdee83a118ea1cf3050b1d006144d46?rev=10-df4bf65a6104ea240f100c30d3cb245d returns

      { "error": "not_found", "reason": "missing" }
      1. cleaned-up.txt
        2 kB
        Paul Joseph Davis
      2. COUCHDB-1163.patch
        21 kB
        Paul Joseph Davis
      3. COUCHDB-1163-1.1.x.patch
        23 kB
        Paul Joseph Davis
      4. Couch logging for jira issue
        7 kB
        Clare Walsh
      5. formatted_rev_tree.txt
        3 kB
        Paul Joseph Davis

        Issue Links

          Activity

          Robert Newson added a comment -

          Documents affected by COUCHDB-885 (where important rev information was not replicated) can have errors in their rev tree. The patch for this ticket corrects those errors when the document is subsequently updated or compacted.

          Robert Newson added a comment -

          Not yet, but it's on my list for tomorrow.

          Paul Joseph Davis added a comment -

          Whoops, I was totally reading the reports wrong earlier.

          But Clare summed it up nicely in her question:

          "Is the consensus that they were deleted but the wrong revision was winning because of a bug in the tree making it look not deleted?"

          To which the answer is yes. Apologies for the confusion.

          So I'm pretty certain that this has fixed the issue. Has anyone checked on the other two tickets that are marked as dupes of this one?

          Paul Joseph Davis added a comment -

          @James

          Oh cool. Yeah, the more I think about it I think maybe we might be better off doing a check at some point in the process. I'll go ponder the code a bit and see if anything interesting occurs to me.

          James Howe added a comment -

          Just FYI we've already rescued all these documents into a clean database, taking whatever the GET without rev returned as "correct". Thus we don't really mind what rescuing you implement, but it seems most users would probably want to err on the side of caution if data loss is a possibility.

          Paul Joseph Davis added a comment -

          @James

          A() -> <<223,75,246,90,97,4,234,36,15,16,12,48,211,203,36,93>>.
          B() -> <<38,76,157,108,36,155,162,252,155,19,223,53,203,68,127,215>>.

          Unfortunately there's no real way to tell. From the example doc in formatted_rev_tree.txt we have a doc value for A() in the branch that starts 9 edits deep (the last branch in that file) but we also have that same hash in the larger first branch. And in the larger first branch, A() has been deleted which is indicated by B(). A proper merge of the trees by definition would say that A() must have been deleted somewhere which means that for all intents and purposes it is deleted. In this particular case what we really want is to rescue the body pointer from A() and introduce it as a new revision in the tree (or alternatively, forget all about B()).

          On the other hand, I could write a fixup tool that goes in and does some fancy pants logic to try and tell if there's an issue and reports those documents that would suddenly drop out after a compaction.

          James Howe added a comment -

          If there's doubt about whether the revision that's still there should be deleted or not, the tree-fixing code should probably preserve it above deletions, to avoid data loss.

          Robert Newson added a comment -

          Paul reminds me that the non-deleted revision is not currently a leaf, so my solution above wouldn't work.

          Clare Walsh added a comment -

          If the doc is supposed to be deleted then deleted is what we want, we don't want to bring it back to life, just get it into its correct state and make replication not fail on it...

          Is the consensus that they were deleted but the wrong revision was winning because of a bug in the tree making it look not deleted?

          (Am about to try replicating the compacted and therefore corrected database...)

          Robert Newson added a comment -

          I think so, yes. We can alter the outcome so that you get the winning revision you want.

          Basically, delete everything you don't want (the deleted:true revisions), and the doc will 'undelete'. Obvious warning: do not compact until you have done this.
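Robert's recipe can be sketched roughly as follows: take the open_revs=all response, collect every leaf marked deleted, and those are the revisions to get rid of so the non-deleted revision wins again. This is a minimal Python illustration; the response below is hypothetical and abridged, and the rev strings are made up.

```python
# Hypothetical, abridged open_revs=all response (rev strings invented).
open_revs = [
    {"ok": {"_id": "doc", "_rev": "10-aaa", "foo": "bar"}},
    {"ok": {"_id": "doc", "_rev": "8-bbb", "_deleted": True}},
    {"ok": {"_id": "doc", "_rev": "9-ccc", "_deleted": True}},
]

# Every leaf carrying _deleted: true is a candidate for removal; once
# they are gone, the surviving non-deleted revision becomes the winner.
unwanted = [r["ok"]["_rev"] for r in open_revs if r["ok"].get("_deleted")]
print(unwanted)  # → ['8-bbb', '9-ccc']
```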

          Clare Walsh added a comment -

          By id only:
          only 404 for each of: deleted=true, conflicts=true, meta=true, revs=true
          open_revs=all gives a list which no longer contains duplicates, doesn't contain the revision that was previously returned as the current doc and doesn't contain anything not deleted (will post list if wanted)

          By id + rev:
          revs=true returns as expected, none of the others seem to make any difference
          open_revs=all gives the same thing as above

          So... the doc is supposed to be deleted, it was only the bug that was making it show up as not deleted before?

          Paul Joseph Davis added a comment -

          Another reference file.

          This is the same tree that is in the other reference file I posted, but after stemming with the new stemming code.

          As you can see, the way that the rev tree ends up, the old rev that appeared to be newest (that starts with the byte 223) is actually not the tip of that revision history path thing, the leaf is the one after it is deleted. So the data is still there, but the tree had gotten wonky so that the cleaned up version makes it appear deleted.

          So, the doc becoming deleted after cleanup does make sense. But this makes it a bit more difficult to figure out how to find and clean other docs.

          Paul Joseph Davis added a comment -

          New patch is the same as the old patch, except I fixed a slight thinger that shouldn't matter too much, though it might possibly.

          When you say its deleted, did you check to see if it was in conflict? Also, can you get all of the revisions with the meta=true option and list those?

          Clare Walsh added a comment - edited

          Ummm...
          So I updated to the latest in trunk, applied the patch (thank you very much for the patch btw), rebuilt and installed and everything...
          Checked that I still had the same behaviour...
          Tried to edit, but couldn't (which isn't that surprising since I couldn't before)...
          Compacted...
          Then when I tried to go to the document just by id it told me the document was deleted...

          Having assumed I'd messed something up I stopped couch, remounted the original db file, restarted, got the document by id successfully, compacted again, and then it still said the document was deleted...

          Just to double check I repeated the process again, same thing... but interestingly when the document says it's deleted (after compaction) I now can get it by revision...

          [EDIT: Hadn't seen your *NEW* patch... will go stick that in now, sorry]

          Paul Joseph Davis added a comment -

          Updated patches to make the guard conditions better.

          Paul Joseph Davis added a comment -

          Good catch. Looks like I broke it in both places. Forgot to switch it to three in the 1.1.x patch and forgot that it could be 3 before a compaction on trunk. Stay tuned for new patches in a minute.

          Also, this is exactly the sort of weirdness that I was worried that variable sized tuples were going to introduce. We should really get around to making those into records.

          Filipe Manana added a comment -

          Paul I have only one comment:

          In couch_doc:kt_value_chooser/2, a non-missing leaf's value is always a 3-element tuple in 1.1.x and 1.0.x. On trunk it can be a 4-element or 3-element tuple (in case we're using DBs created with older releases).

          Paul Joseph Davis added a comment -

          This should apply against 1.1.x. Haven't run any tests with it. I'm tired and heading to bed.

          Paul Joseph Davis added a comment -

          BIG NOTE This patch is only valid on trunk. I'll paste another shortly that will apply to 1.1.x

          formatted_rev_tree.txt is just the logging data, but I've taken one of the weirdo cases and formatted it so it's easier to think about. I found it useful when trying to figure out what was going on.

          COUCHDB-1163.patch is an attempt at translating the last couple of comments of crazy talk into code. This is based on the work Adam and I did earlier tracking down what was going on. There's really not a whole lot of code to it, but the addition of the Choose function to merge/stem/remove_leafs makes it look like there's more than there is.

          So far this has passed make distcheck. It's about to pass the Futon tests as well. Granted, we don't really have a specific pathological test case anywhere for it, so we won't really know till we get one or users start testing it.

          Bob Dionne said he was going to try and get a case that exhibits the behavior tomorrow so we can assert more forcefully that this does fix it.

          And futon finished successfully...

          Adam Kocoloski added a comment -

          Paul also pointed out that in this edge case couch_key_tree will need to deviate from its usual logic of always preferring the existing value over the one in the inserted path. I think we agree that the key_tree merge function should allow the user to supply a function to merge values for a given key. The default merge function would prefer summary tuples > in-memory document bodies > ?REV_MISSING markers.
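The default merge preference Adam proposes (summary tuples > in-memory document bodies > ?REV_MISSING markers) can be sketched like this. Python is used for illustration only; the real code is Erlang in couch_key_tree, and the concrete value shapes here (tuple for an on-disk summary, dict for an in-memory body) are invented stand-ins.

```python
REV_MISSING = object()  # stand-in for CouchDB's ?REV_MISSING marker

def rank(value):
    # Preference order: on-disk summary tuple (0) beats an in-memory
    # body dict (1), which beats a missing value (2).
    if value is REV_MISSING:
        return 2
    return 1 if isinstance(value, dict) else 0

def choose(existing, inserted):
    # Pick the more informative value for a duplicated rev key. Ties go
    # to the existing value, matching the old "prefer what is already
    # in the tree" behaviour.
    return existing if rank(existing) <= rank(inserted) else inserted
```

With this chooser, inserting a real value over a ?REV_MISSING placeholder repairs the tree instead of preserving the hole.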

          Paul Joseph Davis added a comment -

          For posterity:

          The key part that made sorting the paths by starting revision work was to think of it as "always merge paths in order of paths closest to the root of the abstract tree". I call it abstract because after we stem far enough the 'tree' ends up becoming a set of trees paired with a distance to the original root. Sorting paths so that we start out closest to root should guarantee that we never inadvertently end up with a tree that has paths that could be merged together, which should theoretically prevent duped values in the tree. I have a beautiful proof for this but alas it is too long to fit into the margin.

          Also, as Adam mentions, anytime a document is altered it would trigger this cleanup code which would fix the internal state of all docs affected by the duped revision bug. I'm also informed that running a compaction will stem document revisions so that'd be a way to fix up an entire db.
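The ordering rule Paul describes can be illustrated in a few lines. Each stemmed path is modelled here as a hypothetical (start_depth, keys) pair, where start_depth is how far below the original root the path begins; sorting by that depth before merging ensures shallower paths go in first.

```python
# Hypothetical stemmed paths: (start_depth, [rev keys]) pairs.
paths = [(9, ["r10", "r11"]), (0, ["r1", "r2"]), (5, ["r6", "r7"])]

# "Always merge paths in order of paths closest to the root": sort by
# start depth, then merge in that order.
merge_order = sorted(paths, key=lambda p: p[0])
print([depth for depth, _ in merge_order])  # → [0, 5, 9]
```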

          Adam Kocoloski added a comment -

          Summarizing some IRC discussion with Paul:

          The problem in this ticket is definitely related to the bug in the replicator that was fixed in COUCHDB-885. When the replicator pushed documents with attachments to a remote target it sent the leaf revision but omitted information about the revision path that led to that leaf. As a result, the target didn't know how to merge that revision into its tree, and a conflict was created. If you like, you can think of it as imposing an effective _revs_limit of 1.

          However, that bug by itself doesn't introduce duplicate hashes in the tree. The duplicates are introduced when the attachments are removed from all leafs of one of these documents (possibly by deleting the revisions) and a replication is triggered. At that point the replicator uses a different write path unaffected by the COUCHDB-885 bug and transfers the full revision paths to the target. The target database should have figured out how to merge the various disconnected branches back together into a single tree at that point, but it failed. That's the bug we need to address here. It appears that the fix is to sort the paths by starting revision before merging them back together.

          After the eventual patch for this ticket is applied users should be able to repair affected documents by updating any leaf revision.

          Robert Newson added a comment -

          Blocker for 1.0.3 and 1.1.0.

          Robert Newson added a comment -

          Both cases perform a direct rev lookup and both return empty when they shouldn't, probably the same cause.

          Clare Walsh added a comment -

          Adam,

          Thanks for the info
          Yes, we were running 1.0.1 when this all started... then upgraded to 1.0.2 partway through apparently, in an attempt to make the horror stop. But in order to try to replicate/debug the problem while cleaning up our original system we copied the db file and mounted it to various dev environments. I'm running trunk svn from Friday, James is running 1.0.2
          Any other information you need? Any fix I can try? Or logging I can put in to help more?

          Thanks
          Clare

          Adam Kocoloski added a comment -

          Clare, thanks for the logging. It confirms my theory about the divergent responses – revision "10-df4bf65a6104ea240f100c30d3cb245d" shows up twice in the tree, once with a ?REV_MISSING value and once with a normal value that includes the pointer to the document on disk. A direct lookup of "10-df4bf65a6104ea240f100c30d3cb245d" yields the ?REV_MISSING value, but if we sort the tree and pick the winner (as we do in a simple document lookup) we end up with the real value.

          Of course, keys in the hash tree are supposed to be unique; we need to figure out why this is not the case in your setup. It looks like you're running a recent trunk build with a pre-existing database. Can you confirm?
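The divergence Adam describes can be mimicked with a toy duplicated-key table: a first-match lookup (like the rev=... code path) hits the ?REV_MISSING entry, while a sort-and-pick-the-winner lookup (like a plain GET) finds the real value. Everything below is a hypothetical simplification for illustration.

```python
REV_MISSING = "?REV_MISSING"  # stand-in marker

# The same rev key appears twice, as in the logged tree: once with no
# value and once with a pointer to the real document body.
entries = [
    ("10-df4b", REV_MISSING),
    ("10-df4b", {"foo": "bar"}),
]

def direct_lookup(rev):
    # The rev=... code path: first match wins, so it finds the
    # REV_MISSING entry and reports the revision as missing.
    for key, value in entries:
        if key == rev:
            return value

def winning_value(rev):
    # The plain GET code path: sort candidates so a real value beats
    # a missing one, then pick the winner.
    candidates = [v for k, v in entries if k == rev]
    return sorted(candidates, key=lambda v: v == REV_MISSING)[0]
```

The two functions disagree for "10-df4b", which is exactly the symptom in the ticket title: found by id, missing by rev.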

          James Howe added a comment -

          I'm also trying to get a simple reproducible case, here are further details of our setup at the time these broken documents turned up.

          8 couches, with each couch replicating to 3 others (continuous remote-remote replication with trivial filters).
          Validators present for all classes of document.
          Attachments being added to existing documents, creating conflicts (see COUCHDB-885).
          Due to a bug on our end, a lot of documents were updated on every couch at the same revision, repeatedly, causing lots more conflicts.
          At the same time, every 30 seconds, we queried for all conflicts (using a view with doc._conflicts) and did a bulk_docs POST for each doc, performing a no-change update to the deterministic couch winner, and deleting all others (i.e. {_id: foo, _rev: bar, _deleted: true}).

          This lasted for no more than a day after which we started noticing all kinds of things going wrong (replication getting stuck, documents that are impossible to update or delete, etc.)

          We're not in a position to run this exact setup again until we are certain corruption will not occur.
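The cleanup loop James describes builds one _bulk_docs body per conflicted doc: a no-change update to the winner plus a deletion stub for every losing revision. A rough reconstruction, with made-up ids and revs:

```python
# Hypothetical winner and losing conflict revisions for one document.
winner = {"_id": "foo", "_rev": "3-aaa", "value": 1}
losing_revs = ["3-bbb", "2-ccc"]

docs = [dict(winner)]  # no-change update to the deterministic winner
docs += [{"_id": winner["_id"], "_rev": rev, "_deleted": True}
         for rev in losing_revs]
payload = {"docs": docs}  # body for a POST to /database/_bulk_docs
```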

          Bob Dionne added a comment -

          Thanks Clare, this helps a lot, I suspect it's the combination of both and the presence of attachments. I'm hopeful to generate a reproducible case

          Hide
          Clare Walsh added a comment - - edited

          Bob,

          There was an awful lot of replication back and forth (remote to remote, see COUCHDB-885), and a large number of conflicts due to that issue, I believe.

          The only thing we did using bulk edits was, in fact, the deletion of the losers of conflicts, by setting _deleted=true... in case that helps.

          Clare

          Hide
          Bob Dionne added a comment -

          Clare,

          Can you possibly say more about the scenario? Have there been lots of replications back and forth? Were the deletes done through bulk edits? I'm trying to create a small reproducible test.

          Bob

          Hide
          Clare Walsh added a comment - - edited

          I've attached the logged and returned data as requested; both input and output seem to be identical for the calls at [2] and [3].
          I can confirm it definitely goes into case [1], as we already had logging there, and at [2] as well (but not at [3]).
          If you want the same logging but with change [1] undone (to see the original 404), let me know... or any other changes or logs...
          Also, if you have a handy way to make the output nicer, please share: I'm quite new to Erlang, so I'm using ~w, since ~p seemed to produce gibberish and they're the only two format directives I've learnt so far :s
          Thanks for the help so far

          Hide
          Clare Walsh added a comment -

          Output from return changes and logging changes requested by Paul in his comment

          Hide
          Paul Joseph Davis added a comment -

          Also, can you log the RevTree variable in both of those calls? A hypothesis has been floated about the possibility of a diabolical malfunction in the key tree.

          Hide
          Paul Joseph Davis added a comment -

          Adam and I are poking through the code and noticed something unusual. Can you change the return value at [1] to be something like

          {not_found, rev_missing}

          or similar to see if that's the piece of code indicating that the revision is missing?

          If that shows up, then could you log the input and output of the two calls at [2] and [3] to check that the behavior is the same in both places (we treat the return value slightly differently).

          [1] https://github.com/apache/couchdb/blob/1.0.x/src/couchdb/couch_db.erl#L1056
          [2] https://github.com/apache/couchdb/blob/1.0.x/src/couchdb/couch_db.erl#L1048
          [3] https://github.com/apache/couchdb/blob/1.0.x/src/couchdb/couch_db.erl#L1085

          Hide
          Paul Joseph Davis added a comment -

          You'll want to look at the input and output of couch_key_tree:merge/3 as that's a likely place where this sort of thing would occur.
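          For readers unfamiliar with the structure: CouchDB stores a document's edit history as a tree of revision paths, and merging a replicated edit path into the stored tree must never drop an existing revision. The toy sketch below (our own simplification, not CouchDB's actual `couch_key_tree:merge/3`, which operates on nested tuples with stemming) shows the invariant being investigated: a correct merge either extends an existing branch or adds a new conflicting branch, preserving all prior revs.

          ```python
          def merge_paths(tree, new_path):
              """Merge a linear revision path (list of rev ids, root first)
              into a 'tree' modelled as a list of paths. Toy model only."""
              for i, existing in enumerate(tree):
                  # If one path is a prefix of the other, keep the longer one.
                  shorter, longer = sorted([existing, new_path], key=len)
                  if longer[:len(shorter)] == shorter:
                      tree[i] = longer  # extends an existing branch
                      return tree
              tree.append(new_path)  # genuinely new branch: a conflict
              return tree

          tree = [["1-a", "2-b", "3-c"]]
          merge_paths(tree, ["1-a", "2-b", "3-c", "4-d"])  # extends the branch
          merge_paths(tree, ["1-a", "2-x"])                # introduces a conflict
          # tree now holds both branches; no revision was lost
          ```

          The bug being hunted here would correspond to a merge whose output tree lacks interior revisions that were present in its inputs, which is why Paul asks for the input and output of the real merge call.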

          Hide
          Clare Walsh added a comment -

          [Same cluster, different dev]
          As far as I can tell this is happening due to an invalid document state. All revisions in open_revs can be retrieved by rev except the current winner. But interestingly, any revision (apart from 1 and 2) returned from revs=true for the current winner ALSO can't be retrieved by rev.
          No matter how far down into the couch code I go to add debug code, the value in the btree for those revisions is genuinely empty...

          This would imply that the problem is initially caused by incorrect conflict handling or replication; I'm going to try to debug those bits next. Any pointers would be much appreciated.
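          The check described in this comment can be reproduced mechanically. CouchDB's `?revs=true` response carries a `_revisions` object of the form `{"start": N, "ids": [newest_hash, ...]}`; expanding it gives the full `N-hash` rev strings, each of which should be fetchable with `?rev=`. The helper name below is our own; the ids are the ones from this report.

          ```python
          def revisions_to_rev_ids(start, ids):
              """Expand the {"start": N, "ids": [...]} form returned by
              GET doc?revs=true into full 'N-hash' rev strings, newest first."""
              return [f"{start - i}-{rev_hash}" for i, rev_hash in enumerate(ids)]

          revs = revisions_to_rev_ids(10, ["df4bf65a6104ea240f100c30d3cb245d",
                                           "2cead1e4c813a4f0d10a9bc4aa28bfda"])
          # revs[0] == "10-df4bf65a6104ea240f100c30d3cb245d"
          # revs[1] == "9-2cead1e4c813a4f0d10a9bc4aa28bfda"
          # In the broken state, GET /database/<id>?rev=<rev> fails for most of
          # these ancestors even though the winner itself is served by id.
          ```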


            People

            • Assignee: Robert Newson
            • Reporter: James Howe
            • Votes: 2
            • Watchers: 1