CouchDB / COUCHDB-3291

Excessively long document IDs prevent replicator from making progress

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Currently there is no protection in CouchDB against creating document IDs that are too long. Large IDs therefore hit various implicit limits, which usually results in unpredictable failure modes.

      One such implicit limit is hit in the replicator code. The replicator usually handles document IDs in bulk-like calls: it gets them via the changes feed, computes revision differences with a POST to _revs_diff, or inserts them with _bulk_docs. The one exception is fetching open_revs, where it uses a single GET request that carries the document ID in the request URL. That request fails because of a bug or limitation in the HTTP parser: the first line of the HTTP request (the GET line) has to fit in the receive buffer of the receiving socket.

      Increasing that buffer allows larger HTTP request lines to pass through. In the configuration options it can be set as

       chttpd.server_options="[...,{recbuf, 32768},...]"
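
To make the limit concrete, here is a rough sketch that estimates whether the first line of an open_revs GET fits in a given receive buffer. The exact query parameters the replicator sends and the precise boundary condition in the parser are assumptions; the helper names are hypothetical, and 8192 is the default buffer size quoted in the pull request for this issue.

```python
from urllib.parse import quote

# Default receive buffer size in bytes, as quoted in this issue (assumption:
# the whole request line must be no longer than this to be parsed).
RECBUF_DEFAULT = 8192


def request_line(db: str, doc_id: str, rev: str) -> bytes:
    """Build an illustrative first line of an open_revs GET request."""
    path = "/" + quote(db, safe="") + "/" + quote(doc_id, safe="")
    # Hypothetical query string; the real replicator may send more parameters.
    query = "open_revs=" + quote('["' + rev + '"]', safe="")
    return ("GET " + path + "?" + query + " HTTP/1.1\r\n").encode("ascii")


def fits_in_buffer(line: bytes, recbuf: int = RECBUF_DEFAULT) -> bool:
    """True if the whole request line fits in a buffer of `recbuf` bytes."""
    return len(line) <= recbuf


short_line = request_line("db", "doc-1", "1-abc")
long_line = request_line("db", "a" * 9000, "1-abc")

print(fits_in_buffer(short_line))        # True
print(fits_in_buffer(long_line))         # False
print(fits_in_buffer(long_line, 32768))  # True
```

The last call mirrors the `{recbuf, 32768}` setting above: a larger buffer lets the same request line through, but it does not remove the underlying limit.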
      

      Steve Vinoski mentions something about a possible bug in http packet parser code as well:

      http://erlang.org/pipermail/erlang-questions/2011-June/059567.html

      Tracing this a bit, I see that a proper mochiweb request is never even created; instead the request hangs, which further confirms it. It seems in the code here:

      https://github.com/apache/couchdb-mochiweb/blob/bd6ae7cbb371666a1f68115056f7b30d13765782/src/mochiweb_http.erl#L90

      The timeout clause is hit. After adding a catch-all clause, I get the

      {tcp_error,#Port<0.40682>,emsgsize}

      message, which we don't handle. That seems like a sane place to respond with a 413 or similar.
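
The idea of answering explicitly instead of waiting for the timeout can be sketched as follows. This is a hypothetical Python model of the message handling, not mochiweb's code: socket messages are modelled as tuples, and the function name and returned actions are invented for illustration.

```python
from typing import Optional, Tuple


def next_action(msg: tuple) -> Tuple[str, Optional[int]]:
    """Decide what to do with a message received while waiting for the request line.

    Returns (action, http_status); http_status is None when no response is sent.
    """
    kind = msg[0]
    if kind == "tcp_error" and msg[-1] == "emsgsize":
        # The request line did not fit in the receive buffer: respond with
        # an explicit error instead of letting the request hang.
        return ("respond", 413)
    if kind in ("tcp_closed", "tcp_error"):
        # Any other socket failure: just close the connection.
        return ("close", None)
    # Unknown messages are ignored in this sketch.
    return ("ignore", None)


print(next_action(("tcp_error", "#Port<0.40682>", "emsgsize")))  # ('respond', 413)
```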

      There are probably multiple ways to address the issue:

      • Increase the mochiweb listener buffer to fit larger doc IDs. However, the parser problem is a separate bug, and using the buffer size to control document ID size during replication is not reliable. Moreover, that would allow larger IDs to propagate through the system during replication, and every future replication source would then have to be configured with the same maximum recbuf value.
      • Introduce a validation step in couch_doc:validate_docid. Currently that code doesn't read from config files and is in the hot path, so adding a config read there might reduce performance. If enabled, it would stop new documents with large IDs from being created, but we would still have to decide how to handle already existing IDs that are larger than the limit.

      • Introduce a validation/bypass in the replicator. Specifically targeting the replicator might help prevent propagation of large IDs during replication. There is already a similar case: writes of large attachments or large documents (which exceed the request size) are skipped and doc_write_failures is bumped.
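
As an illustration of the second option, a length check at validation time could look roughly like the sketch below. It is not couch_doc:validate_docid; the function name, the byte-based measurement, and the error type are assumptions. The limit is passed in as an argument so that the caller can read the configuration once rather than on every call in the hot path.

```python
from typing import Optional


def validate_doc_id_length(doc_id: str, max_length: Optional[int] = None) -> None:
    """Raise ValueError if the UTF-8 encoded ID is longer than max_length bytes.

    max_length=None means no limit is enforced.
    """
    if max_length is None:
        return
    if len(doc_id.encode("utf-8")) > max_length:
        raise ValueError("document id is too long")


# The limit would be read from configuration once, outside the hot path.
limit = 512
validate_doc_id_length("doc-1", limit)  # passes silently

try:
    validate_doc_id_length("a" * 9000, limit)
except ValueError as err:
    print(err)  # document id is too long
```

Note that a check like this only affects new writes; IDs already stored above the limit would still need separate handling, as described above.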

        Issue Links

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user nickva opened a pull request:

          https://github.com/apache/couchdb-couch-replicator/pull/54

          Allow configuring maximum document ID length during replication

          Currently, due to a bug in the HTTP parser and the lack of document ID
          length enforcement, large document IDs will break replication jobs.
          Large IDs will pass through the _changes feed and _revs_diff, but then
          fail during the open_revs GET request. The open_revs request will keep
          retrying until it gives up after a long enough time; then the replication
          task crashes and restarts with the same pattern. The current effective
          limit is around 8 KB. (The default buffer size is 8192 bytes, and if the
          first line of the request is larger than that, the request will fail.)

          (See http://erlang.org/pipermail/erlang-questions/2011-June/059567.html
          for more information about the possible failure mechanism).

          Bypassing the parser bug by increasing the recbuf size will allow replication
          to finish; however, that simply spreads the abnormal document through
          the rest of the system, which might not always be desirable.

          Also, once long document IDs have been inserted in the source DB, simply
          deleting them doesn't work, as they'd still appear in the changes feed.
          They'd have to be purged or somehow skipped during the replication step.
          This commit helps do the latter.

          Operators can configure the maximum length via this setting:
          ```
          replicator.max_document_id_length=0
          ```

          The default value is 0, which means no maximum is enforced; this is the
          backwards-compatible behavior.

          During replication, if a document's ID exceeds the maximum, that document
          is skipped, an error is written to the log:

          ```
          Replicator: document id `aaaaaaaaaaaaaaaaaaaaa...` from source db `http://.../cdyno-0000001/` is too long, ignoring.
          ```

          and the `"doc_write_failures"` statistic is bumped.
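
The skip-and-count behavior described above can be sketched as follows. This is a Python illustration, not the replicator's Erlang code: the function name, the stats dictionary, the character-based length, and the 21-character log prefix are assumptions. A limit of 0 means no limit, matching the default described in this pull request.

```python
import logging

log = logging.getLogger("replicator")

# Hypothetical number of ID characters to show in the log message.
LOG_PREFIX_LEN = 21


def skip_long_ids(doc_ids, max_id_length, stats, source):
    """Return the IDs that may be replicated, counting the skipped ones.

    max_id_length == 0 disables the check.
    """
    kept = []
    for doc_id in doc_ids:
        if max_id_length > 0 and len(doc_id) > max_id_length:
            shortened = doc_id[:LOG_PREFIX_LEN]  # explicit truncation for the log
            log.error(
                "Replicator: document id `%s...` from source db `%s` "
                "is too long, ignoring.",
                shortened,
                source,
            )
            stats["doc_write_failures"] += 1
            continue
        kept.append(doc_id)
    return kept


stats = {"doc_write_failures": 0}
ids = ["doc-1", "a" * 9000, "doc-2"]
print(skip_long_ids(ids, 4096, stats, "http://.../cdyno-0000001/"))  # ['doc-1', 'doc-2']
print(stats["doc_write_failures"])  # 1
```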

          COUCHDB-3291

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3291-limit-doc-id-size-in-replicator

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/couchdb-couch-replicator/pull/54.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #54


          commit 3ff2d83893481afd68025a52a6d859a2efaf0bcf
          Author: Nick Vatamaniuc <vatamane@apache.org>
          Date: 2017-02-03T23:00:37Z

          Allow configuring maximum document ID length during replication

          COUCHDB-3291


          jira-bot ASF subversion and git services added a comment -

          Commit d23025ebd7176f6c307ddf49902cf20b33bd55c4 in couchdb-couch-replicator's branch refs/heads/master from Nick Vatamaniuc
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb-couch-replicator.git;h=d23025e ]

          Allow configuring maximum document ID length during replication

          COUCHDB-3291

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/couchdb-couch-replicator/pull/54

          jira-bot ASF subversion and git services added a comment -

          Commit 8ca9106fe84c938fcfff161635fd16cf39956b95 in couchdb's branch refs/heads/master from Nick Vatamaniuc
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb.git;h=8ca9106 ]

          Bump replicator dependency

          COUCHDB-3291

          githubbot ASF GitHub Bot added a comment -

          GitHub user nickva opened a pull request:

          https://github.com/apache/couchdb-couch-replicator/pull/55

          Switch replicator max_document_id_length config to use infinity

          Default value switched to be `infinity` instead of 0

          COUCHDB-3291
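
One way to read this change: with an explicit "no limit" value, the length comparison needs no special case for 0. The sketch below is a Python illustration under that assumption; the parsing function and its name are hypothetical and do not reflect how the replicator reads its configuration.

```python
import math
from typing import Union


def parse_max_id_length(raw: str) -> Union[int, float]:
    """Interpret a config string: 'infinity' means unlimited, else a non-negative integer."""
    if raw.strip() == "infinity":
        return math.inf
    value = int(raw)
    if value < 0:
        raise ValueError("max_document_id_length must not be negative")
    return value


def id_too_long(doc_id: str, limit: Union[int, float]) -> bool:
    """A single comparison covers both the limited and the unlimited case."""
    return len(doc_id) > limit


print(id_too_long("a" * 9000, parse_max_id_length("infinity")))  # False
print(id_too_long("a" * 9000, parse_max_id_length("4096")))      # True
```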

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3291-use-infinity

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/couchdb-couch-replicator/pull/55.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #55


          commit 46f70c73427e618774872a388287ba682c1376f1
          Author: Nick Vatamaniuc <vatamane@apache.org>
          Date: 2017-02-08T16:46:13Z

          Switch replicator max_document_id_length config to use infinity

          Default value switched to be `infinity` instead of 0

          COUCHDB-3291


          githubbot ASF GitHub Bot added a comment -

          GitHub user nickva opened a pull request:

          https://github.com/apache/couchdb-couch-replicator/pull/56

          Use string formatting to shorten document ID during logging.

          Previously used an explicit lists:sublist call but value was never used
          anywhere besides the log message.

          COUCHDB-3291
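
The same idea in Python, as an analogy only: a precision in the format specifier truncates the value inside the log message, so no separately truncated copy is needed. The 21-character width and the message text are assumptions for illustration.

```python
doc_id = "a" * 50

# Explicit truncation: builds a shortened copy used only for the message.
explicit = "document id `%s...` is too long" % doc_id[:21]

# Format-based truncation: the precision (.21) limits the printed length.
formatted = "document id `%.21s...` is too long" % doc_id

print(explicit == formatted)  # True
```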

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3291-better-formatting

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/couchdb-couch-replicator/pull/56.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #56


          commit c306fab27dbd88d8ecc8f60fb5ec04e7911fd786
          Author: Nick Vatamaniuc <vatamane@apache.org>
          Date: 2017-02-08T17:02:34Z

          Use string formatting to shorten document ID during logging.

          Previously used an explicit lists:sublist call but value was never used
          anywhere besides the log message.

          COUCHDB-3291


          jira-bot ASF subversion and git services added a comment -

          Commit c306fab27dbd88d8ecc8f60fb5ec04e7911fd786 in couchdb-couch-replicator's branch refs/heads/master from Nick Vatamaniuc
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb-couch-replicator.git;h=c306fab ]

          Use string formatting to shorten document ID during logging.

          Previously used an explicit lists:sublist call but value was never used
          anywhere besides the log message.

          COUCHDB-3291

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/couchdb-couch-replicator/pull/56

          jira-bot ASF subversion and git services added a comment -

          Commit 46f70c73427e618774872a388287ba682c1376f1 in couchdb-couch-replicator's branch refs/heads/master from Nick Vatamaniuc
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb-couch-replicator.git;h=46f70c7 ]

          Switch replicator max_document_id_length config to use infinity

          Default value switched to be `infinity` instead of 0

          COUCHDB-3291

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/couchdb-couch-replicator/pull/55

          jira-bot ASF subversion and git services added a comment -

          Commit 41d71ab93d663c24e3efcb8f08332fd8439c5dfb in couchdb's branch refs/heads/master from Nick Vatamaniuc
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb.git;h=41d71ab ]

          Bump replicator dependency

          Fix default value for replicator max_document_id_length parameter

          Cleanup error logging code

          COUCHDB-3291


            People

            • Assignee: Unassigned
            • Reporter: Nick Vatamaniuc (vatamane)
            • Votes: 0
            • Watchers: 3
