CouchDB / COUCHDB-3178

Fabric does not send message when filtering lots of documents

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Database Core
    • Labels:
      None

      Description

      We managed to mess up part of the fabric merge: fabric_rpc workers that are running filtered changes end up not sending a message for long periods of time if no documents are passing the filter. PR incoming.

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user davisp opened a pull request:

          https://github.com/apache/couchdb-fabric/pull/72

          Send a message when filtering a changes row

          We managed to miss this change during the great merge as it was a
          confusing mess in the cloudant/fabric repository. It's obvious in
          hindsight once you see that we have a
          `fabric_view_changes:handle_message/3` clause to handle the message.

          COUCHDB-3178

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/cloudant/couchdb-fabric 3178-fix-fabric-rpc-filtered-changes

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/couchdb-fabric/pull/72.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #72


          commit 6006bc6c8e491456c186950cdc12257036142b16
          Author: Paul J. Davis <paul.joseph.davis@gmail.com>
          Date: 2016-10-04T20:12:38Z

          Send a message when filtering a changes row

          We managed to miss this change during the great merge as it was a
          confusing mess in the cloudant/fabric repository. It's obvious in
          hindsight once you see that we have a
          `fabric_view_changes:handle_message/3` clause to handle the message.

          COUCHDB-3178


          paul.joseph.davis Paul Joseph Davis added a comment -

          I should note that if you have a replication with a filter that's constantly timing out, this is likely the cause. Also, if that replication is defined as a replicator doc, we're seeing a large amount of load on various nodes: the couchjs process count is much higher because the replication manager keeps retrying the replication, so we end up filtering a whole bunch of docs repeatedly. So while it seems like a small fix, it should actually have a fairly sizable impact on cluster performance and resource usage. I'll update once I've learned more.

          paul.joseph.davis Paul Joseph Davis added a comment -

          Yeap. That fixed it. Kind of amazing how something like that can have such a profound impact on the system. For background, what would happen is that when we got a call to the clustered _changes endpoint, we'd fire off RPC workers for each shard and wait to hear back from them. We never did, so we'd time out.

          However, the RPC workers were still furiously looking for docs that passed the filter, which was just wasting resources since their coordinator had already abandoned them.

          So now filtered changes feeds work again when they have to filter lots of rows (once we merge the PR and get it into a release).
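
The failure mode described above can be illustrated with a small simulation. This is only a sketch in Python, not the fabric code (which is Erlang): the function names, the "filtered" message name, and the use of a step counter in place of a wall-clock timeout are all assumptions made for illustration. It shows a coordinator giving up on a silent worker, and the same worker surviving once it reports every row it filters out.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Event = Optional[Tuple[str, Optional[int]]]


def shard_worker(rows: Iterable[int],
                 passes_filter: Callable[[int], bool],
                 notify_when_filtered: bool) -> Iterator[Event]:
    """Hypothetical shard worker: emits one event per row examined.

    None stands for "row examined, nothing sent" (the buggy behaviour).
    The ("filtered", None) message name is an assumption for this sketch.
    """
    for row in rows:
        if passes_filter(row):
            yield ("change", row)
        elif notify_when_filtered:
            yield ("filtered", None)  # tells the coordinator we are alive
        else:
            yield None  # silence while skipping a filtered row
    yield ("complete", None)


def coordinator(events: Iterable[Event],
                max_silent_steps: int) -> Tuple[str, List[int]]:
    """Hypothetical coordinator: abandons the worker after too much silence.

    Consecutive silent steps stand in for a wall-clock timeout.
    """
    changes: List[int] = []
    silent = 0
    for event in events:
        if event is None:
            silent += 1
            if silent > max_silent_steps:
                return ("timeout", changes)
            continue
        silent = 0  # any message resets the timeout
        kind, payload = event
        if kind == "change" and payload is not None:
            changes.append(payload)
        elif kind == "complete":
            return ("ok", changes)
    return ("ok", changes)


def only_last(row: int) -> bool:
    """Filter that rejects every row except the last of 1000."""
    return row == 999


# Buggy behaviour: the worker stays silent and the coordinator gives up.
before = coordinator(shard_worker(range(1000), only_last, False), 100)
# Fixed behaviour: a message per filtered row keeps the coordinator waiting.
after = coordinator(shard_worker(range(1000), only_last, True), 100)
```

In this model `before` ends in a timeout with no changes collected, while `after` completes and returns the single matching row. The sketch does not model the second half of the problem, where abandoned workers keep scanning after the coordinator has given up.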

          jira-bot ASF subversion and git services added a comment -

          Commit 6006bc6c8e491456c186950cdc12257036142b16 in couchdb-fabric's branch refs/heads/master from Paul Joseph Davis
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb-fabric.git;h=6006bc6 ]

          Send a message when filtering a changes row

          We managed to miss this change during the great merge as it was a
          confusing mess in the cloudant/fabric repository. It's obvious in
          hindsight once you see that we have a
          `fabric_view_changes:handle_message/3` clause to handle the message.

          COUCHDB-3178

          githubbot ASF GitHub Bot added a comment -

          GitHub user asfgit closed the pull request at:

          https://github.com/apache/couchdb-fabric/pull/72


            People

            • Assignee:
              Unassigned
              Reporter:
              paul.joseph.davis Paul Joseph Davis
            • Votes:
              0
              Watchers:
              3
