COUCHDB-793

replication hangs (recent @dev thread)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.2, 0.11
    • Fix Version/s: 0.11.1, 1.0, 1.0.3
    • Component/s: Replication
    • Labels:
      None
    • Environment:

      trunk and 0.11 partially

      Description

      Following the recent @dev thread about the replication.js test hanging, this patch fixes two causes of the hang:

      1) In couch_rep_reader, if the reader_loop process finishes before all the monitored processes (which read docs from the source db) finish, the couch_db_writer process never receives the {complete, HighSeq} message. This happens more frequently for replication by doc_ids.

      2) For trunk only, in couch_rep_writer, if we replicate a document with attachments and the first couch_rep_httpc upload attempt (remote target db case) doesn't succeed, subsequent attempts will always fail, because the couch_work_queue used was closed after the first attempt and the streaming function passed to ibrowse will always return eof.
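
      The failure mode in 2) can be illustrated with a small Python sketch. This is a toy model and not the CouchDB code: `WorkQueue`, `make_stream`, `upload`, and both retry helpers are hypothetical names, and the retry is modelled as sending an empty body rather than failing outright. The second helper shows one possible remedy (a fresh queue per attempt), which is an assumption and not necessarily what the patch does.

```python
EOF = object()  # stand-in for the `eof` value returned by the streaming function


class WorkQueue:
    """Toy queue: once drained it is closed and only ever yields EOF."""

    def __init__(self, chunks):
        self._items = list(chunks)
        self.closed = False

    def dequeue(self):
        if self._items:
            return self._items.pop(0)
        self.closed = True
        return EOF


def make_stream(q):
    """Return a zero-argument streaming function backed by the queue."""
    return q.dequeue


def upload(stream, fail=False):
    """Hypothetical upload: drains the stream, then optionally fails."""
    body = b""
    while True:
        chunk = stream()
        if chunk is EOF:
            break
        body += chunk
    if fail:
        raise ConnectionError("simulated upload failure")
    return body


def upload_with_retry_shared(chunks, failures):
    """Problematic shape: every attempt reuses the same, already drained queue."""
    stream = make_stream(WorkQueue(chunks))
    for attempt in range(failures + 1):
        try:
            return upload(stream, fail=attempt < failures)
        except ConnectionError:
            continue


def upload_with_retry_fresh(chunks, failures):
    """Possible remedy: build a new queue and stream for each attempt."""
    for attempt in range(failures + 1):
        stream = make_stream(WorkQueue(chunks))
        try:
            return upload(stream, fail=attempt < failures)
        except ConnectionError:
            continue
```

      With one simulated failure, the shared-queue variant has nothing left to send on the retry, while the fresh-queue variant resends the full attachment body.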

      The patch fixes both problems. With it, the replication test succeeds when run 20+ times in a row, as do all other JS and Etap tests.

      Reason 1) might be the cause for COUCHDB-596 as well.
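
      The ordering problem behind reason 1) can be sketched as a tiny event-driven model in Python. It is only an analogy under stated assumptions: the `Reader` class, its fields, and the event handler names are hypothetical and are not taken from couch_rep_reader. The model assumes the completion message is sent only when the reader loop ends with no fetchers pending, so a loop that ends early never triggers it unless the check is repeated when the last fetcher exits.

```python
class Reader:
    """Toy model of a reader that must tell the writer when it is complete."""

    def __init__(self, fixed):
        self.fixed = fixed      # True: re-check completion when a fetcher exits
        self.pending = 0        # monitored fetcher processes still running
        self.loop_done = False  # has the reader loop finished?
        self.high_seq = None
        self.outbox = []        # messages delivered to the writer

    def spawn_fetcher(self):
        self.pending += 1

    def on_loop_finished(self, high_seq):
        self.loop_done = True
        self.high_seq = high_seq
        if self.pending == 0:
            self._send_complete()
        # Buggy variant: if fetchers are still pending, nothing re-checks later.

    def on_fetcher_down(self):
        self.pending -= 1
        if self.fixed and self.loop_done and self.pending == 0:
            self._send_complete()

    def _send_complete(self):
        self.outbox.append(("complete", self.high_seq))


def run(fixed, loop_finishes_first):
    """Replay one of the two event orderings and return the writer's messages."""
    reader = Reader(fixed)
    reader.spawn_fetcher()
    reader.spawn_fetcher()
    if loop_finishes_first:
        reader.on_loop_finished(high_seq=42)
        reader.on_fetcher_down()
        reader.on_fetcher_down()
    else:
        reader.on_fetcher_down()
        reader.on_fetcher_down()
        reader.on_loop_finished(high_seq=42)
    return reader.outbox
```

      In this model, `run(fixed=False, loop_finishes_first=True)` leaves the writer's inbox empty, which corresponds to the hang, while the other three combinations deliver exactly one completion message.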

          Activity

          Paul Bonser added a comment -

          I was affected by this issue pretty severely, with the freezes happening very frequently, and I can verify that this patch fixes the issue. I also had 20+ successful runs of the replication test in a row.

          Filipe Manana added a comment -

          Thanks Paul.

          You also did a very good analysis of problem #1. This type of deadlock issue is not easy to find.

          cheers

          Adam Kocoloski added a comment -

          This patch fixes the hanging for me, too.

          The fix for #1 should go into the 0.10 and 0.11 branches, too.

          Adam Kocoloski added a comment -

          Patches applied to 0.10.x, 0.11.x, and trunk.


            People

            • Assignee:
              Unassigned
              Reporter:
              Filipe Manana
            • Votes:
              0
              Watchers:
              1
