CouchDB / COUCHDB-3234

Track open shard timeouts with a counter instead of logging

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Database Core
    • Labels: None

      Description

      Fabric uses the open_shard RPC method to fetch the security object for every request. These calls have very short timeouts, which can cause massive amounts of log spam when a node is under load. Rather than writing a log line each time one of them fails, let's just increment a counter instead.

      PR incoming
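
      The idea in the description can be illustrated with a minimal sketch. This is not the Fabric code (which is Erlang); it is a Python illustration under stated assumptions. The `COUNTERS` registry, the metric name `open_shard.timeouts`, the function `open_shard_with_fallback`, and the `fake_open` helper are all hypothetical names made up for this example.

      ```python
      import collections
      import concurrent.futures
      import time

      # Hypothetical in-process metrics registry; a real node would report to
      # its own stats system instead.
      COUNTERS = collections.Counter()


      def open_shard_with_fallback(shards, open_fn, timeout_s=0.1):
          """Try each shard copy in turn with a short timeout.

          A timeout bumps a counter (which can be graphed and alerted on)
          instead of emitting a log line, then the next shard is tried.
          Returns the first result, or None if every attempt timed out.
          """
          # One worker per shard so a slow call never blocks the next attempt.
          pool = concurrent.futures.ThreadPoolExecutor(max_workers=max(len(shards), 1))
          try:
              for shard in shards:
                  future = pool.submit(open_fn, shard)
                  try:
                      return future.result(timeout=timeout_s)
                  except concurrent.futures.TimeoutError:
                      # Count the timeout rather than logging it.
                      COUNTERS["open_shard.timeouts"] += 1
              return None
          finally:
              # Do not wait for slow calls that were already abandoned.
              pool.shutdown(wait=False)


      def fake_open(shard):
          """Hypothetical stand-in for the RPC: the 'slow' shard is overloaded."""
          if shard == "slow":
              time.sleep(0.5)
          return "security:" + shard


      result = open_shard_with_fallback(["slow", "fast"], fake_open, timeout_s=0.1)
      print(result, dict(COUNTERS))
      ```

      In this sketch an overloaded shard costs one counter increment per timed-out attempt, so the volume of output no longer grows with load, while the rate of timeouts stays visible to monitoring.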

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user davisp opened a pull request:

          https://github.com/apache/couchdb-fabric/pull/74

          Track open_shard timeouts with a counter

          The open_shard RPC endpoint is used to grab security docs. There are
          fairly aggressive timeouts on these requests so that when a node is too
          busy it'll try the next shard. Rather than log every time these fail
          (which can be substantial under load), let's just use a counter that can
          be graphed and alerted on.

          COUCHDB-3234

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/cloudant/couchdb-fabric COUCHDB-3234-open-shard-timeout-counter

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/couchdb-fabric/pull/74.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #74


          commit 4fcfecf720791f400169c6823aac1abc2d85c47e
          Author: Paul J. Davis <paul.joseph.davis@gmail.com>
          Date: 2016-11-11T17:34:11Z

          Track open_shard timeouts with a counter

          The open_shard RPC endpoint is used to grab security docs. There are
          fairly aggressive timeouts on these requests so that when a node is too
          busy it'll try the next shard. Rather than log every time these fail
          (which can be substantial under load), let's just use a counter that can
          be graphed and alerted on.

          COUCHDB-3234


          jira-bot ASF subversion and git services added a comment -

          Commit 4fcfecf720791f400169c6823aac1abc2d85c47e in couchdb-fabric's branch refs/heads/master from Paul Joseph Davis
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb-fabric.git;h=4fcfecf ]

          Track open_shard timeouts with a counter

          The open_shard RPC endpoint is used to grab security docs. There are
          fairly aggressive timeouts on these requests so that when a node is too
          busy it'll try the next shard. Rather than log every time these fail
          (which can be substantial under load), let's just use a counter that can
          be graphed and alerted on.

          COUCHDB-3234

          jira-bot ASF subversion and git services added a comment -

          Commit 67978b140acc4a3afdb3679b47e989c5b0caf194 in couchdb-fabric's branch refs/heads/master from Paul Joseph Davis
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb-fabric.git;h=67978b1 ]

          Merge branch 'COUCHDB-3234-open-shard-timeout-counter'

          githubbot ASF GitHub Bot added a comment -

          GitHub user asfgit closed the pull request at:

          https://github.com/apache/couchdb-fabric/pull/74

          paul.joseph.davis Paul Joseph Davis added a comment -

          Merged.

          jira-bot ASF subversion and git services added a comment -

          Commit f00b2ecab5c699b87dca92c4248ecffe6a16e7f1 in couchdb's branch refs/heads/master from Eric Avdey
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb.git;h=f00b2ec ]

          Update dependencies

          couch_epi f6ad55..60e7f8

          • Merge remote branch 'DeadZen:patch-1'
          • Update README.md

          documentation 52a287..59a887

          • Spelling error fix: fauxuton to fauxton
          • Document stable and update query parameters
          • Fixing type error
          • Tiny 127 address fix
          • More command cleanup
          • Fixed the transposition of the version or erlang
          • Fix Fauxton docs

          chttpd 3dcdb6..7bfd25

          • Merge remote branch 'cloudant:78077-pass-user_ctx_to_filter'
          • Include user_ctx in db open options
          • Merge remote branch 'cloudant:71810-handle-errors-terms-from-fabric'
          • Handle error terms from fabric
          • Merge default update response headers with custom ones

          fabric 7cfabb..205064

          • Merge branch 'COUCHDB-3234-open-shard-timeout-counter'
          • Track open_shard timeouts with a counter
          • Merge remote-tracking branch 'cloudant/3232-all-docs-ctx'
          • Pass user_ctx down to fabric_rpc
          • Merge remote branch 'cloudant:77984-upgrade-mrargs-record-phase2'
          • Revert "Revert "Merge remote-tracking branch
            'banjiewen/stale-stable-update'""
          • Upgrade #mrargs{} record
          • Merge remote branch 'cloudant:77984-upgrade-mrargs-record-phase1'
          • Compatibility clause for the record upgrade
          • Revert "Merge remote-tracking branch 'banjiewen/stale-stable-update'"
          • Merge remote branch 'cloudant:fix-typespecs'
          • Add `{error, Reason}` to typespecs

          • Merge remote branch 'cloudant:77984-fixup'
          • Use upgraded #mrargs{} instead of old one

          mem3 252467..c3c542

          • Merge remote branch 'cloudant:79066-port-chunkified-replicate_batch'
          • Chunk missing revisions before attempting to save on target

          couch_mrview 589943..15a1ae

          • Merge remote branch 'cloudant:77867-convert-pid-to-binary'
          • Convert pid of indexer to binary

          mango 50066b..4afd60

          • Add config parameter to reject index all text indexes

          couch 1659fd..b83f1a

          • Merge branch '3251-remove-filename-rootname'
          • Remove use of filename:rootname/1

            People

            • Assignee: Unassigned
            • Reporter: Paul Joseph Davis (paul.joseph.davis)
            • Votes: 0
            • Watchers: 3
