Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.0.2, 1.0.3
    • Fix Version/s: None
    • Component/s: Replication
    • Environment:

      CentOS 5.6 64 bit, XFS HDD drive. Spidermonkey 1.9.2 or 1.7

Description

      We have a setup replicating 7 databases from a master to a slave. Two of the databases use filters. One of these (the infrequently updated one) is failing replication. We have a cronjob that polls replication once per minute, and these stack traces appear often in the logs.

      The network is a gigabit LAN, or two VMs on the same host (the same result is seen on both).
      The replication job is started by SSHing into the target and then curling the target's local _replicate endpoint, with the source database as the replication source (Source -> Target):

      ssh TargetServer 'curl -sX POST -H "content-type:application/json" http://localhost:5984/_replicate -d {"source":"http://SourceServer:5984/DataBase","target":"DataBase","continuous":true,"filter":"productionfilter/notProcessingJob"}'
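
      For context, the filter named in that request is a function in a design document on the source database. The actual filter body is not included in this report; a minimal sketch of the general shape (the doc.type check below is purely illustrative) would be:

      {
        "_id": "_design/productionfilter",
        "filters": {
          "notProcessingJob": "function(doc, req) { /* illustrative condition only; real filter body not shown in this report */ return doc.type !== 'processing_job'; }"
        }
      }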

      changes_timeout is not defined in the ini files.
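
      To confirm that no timeout override is active, the running configuration can be queried over the _config API (admin credentials may be required). Whether changes_timeout would live under the replicator section is an assumption here:

      curl -s http://localhost:5984/_config/replicator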

      Logs are attached showing the stack traces on the source couch and the target couch.

        Issue Links

          Activity

          Alex Markham created issue -
          Alex Markham added a comment -

          Log snippets attached.

          Alex Markham made changes -
          Attachment: Couchdb Filtered replication source timeout .txt [ 12487311 ]
          Attachment: Couchdb Filtered replication target timeout .txt [ 12487312 ]
          Alex Markham added a comment -

          I should add that if I manually poll the changes URL with the filter enabled using curl, it seems to work fine (though not tested for long periods):
          http://SourceServer:5984/DataBase/_changes?filter=productionfilter/notProcessingJob&style=all_docs&heartbeat=10000&since=40034&feed=continuous
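
          (For reference, the manual poll amounts to a single GET of that URL; quoting it keeps the shell from interpreting the ampersands:)

          curl -s 'http://SourceServer:5984/DataBase/_changes?filter=productionfilter/notProcessingJob&style=all_docs&heartbeat=10000&since=40034&feed=continuous'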

          Robert Newson added a comment -

          The 'Reason for termination == changes_timeout' points at the internal use of the timer module rather than anything network related. I took a quick look at how the timer is set (and cancelled) and it looks OK. It does appear to be reset when a heartbeat is received, even if there's a filter.

          Hans-D. Böhlau added a comment -

          We noticed exactly the same effect in our project (using CouchDB 1.0.2). Using filtered replication to regularly update a second database is not stable as soon as we have a few hundred documents in the source database. We see the same timeout entries in our log file.

          Requesting the (filtered) changes feed manually shows that it takes more and more time until the response is delivered, the more the number of documents (and/or changes) in the database increases.

          Maybe it's important: we have a lot of documents with attachments.

          Best regards,
          Hans

          Jan Lehnardt made changes -
          Link: This issue duplicates COUCHDB-1289
          Filipe Manana made changes -
          Link: This issue is related to COUCHDB-1289
          Filipe Manana added a comment -

          This has a very good chance of being caused by COUCHDB-1289.


            People

            • Assignee:
              Unassigned
            • Reporter:
              Alex Markham
            • Votes:
              0
            • Watchers:
              0

              Dates

              • Created:
              • Updated:
