Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Two issues with mem3_shards were fixed while I was testing the PSE code.

      The first issue, found by Jay Doane, is that a database can have its shards inserted into the cache after it has been deleted. This can happen if a client does a rapid CREATE/DELETE/GET cycle on a database. The fix is to track the update sequence reported by the changes feed listener and only insert shard maps that come from a client that has read an update_seq at least as recent as the one mem3_shards has seen.
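      The sequence check described above can be modeled in a few lines. This is a minimal Python sketch for illustration only, not the Erlang code in mem3_shards; the names `ShardCache`, `on_change`, and `maybe_insert` are hypothetical.

```python
class ShardCache:
    """Toy model of a shard-map cache guarded by an update sequence."""

    def __init__(self):
        self.update_seq = 0   # latest sequence seen by the changes feed listener
        self.shards = {}      # database name -> shard map

    def on_change(self, seq, db_name, deleted=False):
        """Called by the (hypothetical) changes listener for each change."""
        self.update_seq = max(self.update_seq, seq)
        if deleted:
            self.shards.pop(db_name, None)

    def maybe_insert(self, db_name, shard_map, client_seq):
        """Cache a client-supplied shard map only if the client is not stale."""
        if client_seq < self.update_seq:
            return False      # client read an older view; ignore the request
        self.shards[db_name] = shard_map
        return True


cache = ShardCache()
cache.on_change(1, "db1")                   # CREATE observed at seq 1
cache.on_change(2, "db1", deleted=True)     # DELETE observed at seq 2
# A slow client that read the shard map at seq 1 now tries to cache it.
stale = cache.maybe_insert("db1", ["shard-a"], client_seq=1)
cache.on_change(3, "db2")                   # a different database is created
fresh = cache.maybe_insert("db2", ["shard-b"], client_seq=3)
```

      In this model the stale insert for the deleted database is rejected, while a client that is caught up with the listener can still populate the cache.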

      The second issue, found during heavy benchmarking, is that large shard maps (in the Q>=128 range) can quite easily cause mem3_shards to back up when a thundering herd attempts to open the database. There's no coordination among workers trying to add a shard map to the cache, so if a bunch of independent clients all send the shard map at once (say, at the beginning of a benchmark), mem3_shards can get overwhelmed. The fix was twofold. First, rather than sending the shard map directly to mem3_shards, we copy it into a spawned writer process, and when/if mem3_shards wants to write it, it tells this writer process to do its business. Second, we create an ets table to track these writer processes. Independent clients can then check whether a shard map is already en route to mem3_shards by using ets:insert_new and canceling their writer if that returns false.
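      The deduplication idea can be sketched as follows. This is a simplified Python model under stated assumptions, not the actual implementation: `WriterRegistry` stands in for the ets table (only its insert-if-absent behavior is modeled), and `Writer`, `offer_shard_map`, and `drain` are hypothetical names. Real writers are separate Erlang processes; here they are plain objects.

```python
import threading


class WriterRegistry:
    """Toy stand-in for an ets table used with insert_new semantics."""

    def __init__(self):
        self._lock = threading.Lock()
        self._writers = {}

    def insert_new(self, db_name, writer):
        """Register the writer unless one already exists for db_name."""
        with self._lock:
            if db_name in self._writers:
                return False
            self._writers[db_name] = writer
            return True

    def delete(self, db_name):
        with self._lock:
            self._writers.pop(db_name, None)


class Writer:
    """Holds one copy of a shard map until the cache asks for it."""

    def __init__(self, db_name, shard_map):
        self.db_name = db_name
        self.shard_map = shard_map

    def cancel(self):
        self.shard_map = None   # drop the copy; nothing will be written


def offer_shard_map(registry, pending, db_name, shard_map):
    """Client side: create a writer, keep it only if it is the first one."""
    writer = Writer(db_name, shard_map)
    if registry.insert_new(db_name, writer):
        pending.append(writer)  # the cache only learns that a writer exists
        return writer
    writer.cancel()             # another client's writer is already en route
    return None


def drain(registry, pending, cache):
    """Cache side: let each pending writer write, then unregister it."""
    while pending:
        writer = pending.pop(0)
        cache[writer.db_name] = writer.shard_map
        registry.delete(writer.db_name)


registry, pending, cache = WriterRegistry(), [], {}
# One hundred clients offer the same shard map at once.
results = [offer_shard_map(registry, pending, "db1", ["s1", "s2"])
           for _ in range(100)]
accepted = [w for w in results if w is not None]
queued = len(pending)
drain(registry, pending, cache)
```

      Only the first offer registers a writer, so the cache sees a single pending write no matter how many clients arrive together. Removing the registry entry after the write is what allows a later writer for the same database to be accepted.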

      PR incoming.

        Activity

        paul.joseph.davis Paul Joseph Davis added a comment - PR: https://github.com/apache/couchdb/pull/476
        jira-bot ASF subversion and git services added a comment -

        Commit 6bd2d63f101c2a5f4bc0587e3809cdabd1a00693 in couchdb's branch refs/heads/COUCHDB-3376-fix-mem3-shards from Nick Vatamaniuc
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=6bd2d63 ]

        Add unit tests for mem3_shards race condition and writer process

        COUCHDB-3376

        jira-bot ASF subversion and git services added a comment -

        Commit 3746e3c44a905f06f85d77930015a62f4a52e00c in couchdb's branch refs/heads/COUCHDB-3376-fix-mem3-shards from Nick Vatamaniuc
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=3746e3c ]

        Correctly delete writer information from ?OPENERS in mem3_shards

        `?OPENERS` is an ETS table of type bag. To delete one specific object, you have to
        use `ets:delete_object(Tab, Object)`.

        Without this fix writers were never cleaned up and no new writers could be
        spawned.

        COUCHDB-3376
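
        The difference between deleting by key and deleting one object in a bag table can be illustrated with a toy model. This Python sketch only mimics the documented semantics of `ets:delete/2` and `ets:delete_object/2`; `BagTable` and the writer names are hypothetical and do not come from the commit.

```python
from collections import defaultdict


class BagTable:
    """Toy model of an ETS bag table: many distinct objects per key."""

    def __init__(self):
        self._rows = defaultdict(set)

    def insert(self, obj):
        # The first tuple element plays the role of the key.
        self._rows[obj[0]].add(obj)

    def delete(self, key):
        """Like ets:delete/2: remove every object stored under key."""
        self._rows.pop(key, None)

    def delete_object(self, obj):
        """Like ets:delete_object/2: remove only the exactly matching object."""
        rows = self._rows.get(obj[0])
        if rows is not None:
            rows.discard(obj)
            if not rows:
                del self._rows[obj[0]]

    def lookup(self, key):
        return sorted(self._rows.get(key, ()))


openers = BagTable()
openers.insert(("db1", "writer-1"))
openers.insert(("db1", "writer-2"))
openers.delete_object(("db1", "writer-1"))   # removes just this one object
remaining = openers.lookup("db1")
openers.delete("db1")                        # removes everything under the key
after_delete = openers.lookup("db1")
```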

        jira-bot ASF subversion and git services added a comment -

        Commit 48536fb00ec1e5a78b417d00bcf52c907adca0c7 in couchdb's branch refs/heads/COUCHDB-3376-fix-mem3-shards from Paul Joseph Davis
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=48536fb ]

        Tweak mem3_shard test contributions

        This change is just to make the shard map assertions a bit more obvious
        so that we're not relying on `mock_shards/0` to return a constant.

        COUCHDB-3376

        jira-bot ASF subversion and git services added a comment -

        Commit e4c3705def6021a6b801c0bc0ceaac4abbc7c0d8 in couchdb's branch refs/heads/COUCHDB-3376-fix-mem3-shards from Paul Joseph Davis
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=e4c3705 ]

        Fix stale shards cache

        There's a race condition in mem3_shards that can result in having shards
        in the cache for a database that's been deleted. This results in a
        confused cluster that thinks a database exists until you attempt to open
        it.

        The fix is to ignore any cache insert requests that come from an older
        version of the dbs db than mem3_shards cache knows about.

        Big thanks to @jdoane for the identification and original patch.

        COUCHDB-3376

        jira-bot ASF subversion and git services added a comment -

        Commit a4b57a9e5d5b7a7147540a4d795eeca201fa4a1d in couchdb's branch refs/heads/COUCHDB-3376-fix-mem3-shards from Nick Vatamaniuc
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=a4b57a9 ]

        Add unit tests for mem3_shards

        COUCHDB-3376

        jira-bot ASF subversion and git services added a comment -

        Commit 32e5006f8d7e08579dbe9fab0e7e7ac6d78d7d40 in couchdb's branch refs/heads/COUCHDB-3376-fix-mem3-shards from Nick Vatamaniuc
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=32e5006 ]

        Add unit tests for mem3_shards

        COUCHDB-3376

        jira-bot ASF subversion and git services added a comment -

        Commit c1c6891aafac548314c8eb610b8e63f1997b107c in couchdb's branch refs/heads/COUCHDB-3376-fix-mem3-shards from Nick Vatamaniuc
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=c1c6891 ]

        Add unit tests for mem3_shards

        COUCHDB-3376

        jira-bot ASF subversion and git services added a comment -

        Commit e4c3705def6021a6b801c0bc0ceaac4abbc7c0d8 in couchdb's branch refs/heads/master from Paul Joseph Davis
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=e4c3705 ]

        Fix stale shards cache

        There's a race condition in mem3_shards that can result in having shards
        in the cache for a database that's been deleted. This results in a
        confused cluster that thinks a database exists until you attempt to open
        it.

        The fix is to ignore any cache insert requests that come from an older
        version of the dbs db than mem3_shards cache knows about.

        Big thanks to @jdoane for the identification and original patch.

        COUCHDB-3376

        jira-bot ASF subversion and git services added a comment -

        Commit c1c6891aafac548314c8eb610b8e63f1997b107c in couchdb's branch refs/heads/master from Nick Vatamaniuc
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=c1c6891 ]

        Add unit tests for mem3_shards

        COUCHDB-3376

        jira-bot ASF subversion and git services added a comment -

        Commit a3ec4fef1a68628ea7fba161399cb46aea7280f6 in couchdb's branch refs/heads/master from Paul Joseph Davis
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=a3ec4fe ]

        Merge pull request #476 from apache/COUCHDB-3376-fix-mem3-shards

        Couchdb 3376 fix mem3 shards

        jira-bot ASF subversion and git services added a comment -

        Commit c1c6891aafac548314c8eb610b8e63f1997b107c in couchdb's branch refs/heads/COUCHDB-3288-mixed-cluster-upgrade from Nick Vatamaniuc
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=c1c6891 ]

        Add unit tests for mem3_shards

        COUCHDB-3376

        jira-bot ASF subversion and git services added a comment -

        Commit a3ec4fef1a68628ea7fba161399cb46aea7280f6 in couchdb's branch refs/heads/COUCHDB-3288-mixed-cluster-upgrade from Paul Joseph Davis
        [ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=a3ec4fe ]

        Merge pull request #476 from apache/COUCHDB-3376-fix-mem3-shards

        Couchdb 3376 fix mem3 shards


          People

          • Assignee: Unassigned
          • Reporter: paul.joseph.davis Paul Joseph Davis
          • Votes: 0
          • Watchers: 2