COUCHDB-487: Pause write requests to allow compaction to complete

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Database Core
    • Labels: None
    • Skill Level: Committers Level (Medium to Hard)

      Description

      A continuous stream of writes can currently prevent the compaction process from ever flipping over to the .compact file.

      Here's a small patch that adds a new db flag called 'write_available', which becomes false the first time compaction completes a pass but fails to flip over because of concurrent writes. Subsequent calls to update_docs then sleep for half a second and throw retry.
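
      As a rough, self-contained sketch of that idea (the module, record, and function names here are assumptions for illustration, not the attached patch):

          %% Illustrative sketch only; not the attached patch.
          -module(compaction_gate).
          -export([update_docs/2, block_writes/1, unblock_writes/1]).

          -record(db, {name, write_available = true}).

          %% Once compaction has cleared the flag after a failed flip,
          %% writers back off for half a second and tell callers to retry.
          update_docs(#db{write_available = false}, _Docs) ->
              timer:sleep(500),
              throw(retry);
          update_docs(#db{} = Db, Docs) ->
              do_write(Db, Docs).

          block_writes(Db) ->
              Db#db{write_available = false}.

          unblock_writes(Db) ->
              Db#db{write_available = true}.

          %% Stand-in for the real write path.
          do_write(_Db, _Docs) ->
              ok.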

      I ran 'ab -p json -n 1000000 http://localhost:5984/db1' in one window. Without this patch, I get a long sequence of:

      [info] [<0.64.0>] Compaction file still behind main file (update seq=1848. compact update seq=1845). Retrying.

      and compaction never completes.

      With the patch, I get this:

      [info] [<0.369.0>] 127.0.0.1 - - 'POST' /db1 201
      [info] [<0.65.0>] Compaction file still behind main file (update seq=2140. compact update seq=2092). Retrying.
      [info] [<0.65.0>] Blocking writes to complete compaction.
      [info] [<0.65.0>] Compaction for db "db1" completed.
      [info] [<0.370.0>] 127.0.0.1 - - 'POST' /db1 201

        Activity

        Robert Newson added a comment -

        Block writes at compaction completion.

        Robert Newson added a comment -

        A better version of the same idea.

        Adam Kocoloski added a comment -

        I don't think we want to be that aggressive about blocking incoming writes. This algorithm would cause a large DB to be unavailable for writes for several minutes, even if the write load is not so large as to prevent compaction from finishing without blocking. The reason is that each compaction iteration works with a single MVCC snapshot; any updates to the DB that occur during that iteration are compacted in the next pass.

        I've seen several cases where compaction finishes in 3-4 passes, with the first one taking several hours, the next one several minutes, the next one several seconds, and then it's done.

        I think if we're going to block writes we should only do it when we have some expectation that the non-blocking approach isn't working; e.g. we could track (update_seq - compact_update_seq) and if that difference grows between iterations, block until compaction is finished.
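
        As a sketch of that heuristic (the module and function names are assumptions, not CouchDB code):

            -module(compaction_policy).
            -export([should_block_writes/3]).

            %% After each compaction pass, block writes only if the seq gap
            %% grew since the previous pass, i.e. compaction is losing ground.
            should_block_writes(UpdateSeq, CompactSeq, PrevDelta) ->
                Delta = UpdateSeq - CompactSeq,
                case Delta > PrevDelta of
                    true  -> {block, Delta};
                    false -> {retry, Delta}
                end.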

        Thanks for kicking off this ticket with some code, though!

        Robert Newson added a comment -

        That's a fair concern. My preferred fix is to use multiple files that can be compacted/obsoleted separately, which makes compaction non-blocking even under heavy and continuous writing.

        I was aware of the multi-pass nature of compaction when I wrote the patch, which is why it doesn't kick in at all unless compaction tries and fails to switch over to the .compact file. I can modify the patch to set write_available=false only if the seq delta is below a configurable threshold, which is a nice change anyway; the default can be 0, which is today's behavior.
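
        A sketch of that tweak (names are assumptions): with the default threshold of 0, the delta, which is never negative, is never below it, so writes are never blocked.

            -module(compaction_threshold).
            -export([maybe_block_writes/3]).

            %% Block writes only when compaction is nearly caught up, i.e.
            %% the remaining seq delta is below the configured threshold.
            maybe_block_writes(UpdateSeq, CompactSeq, Threshold) ->
                Delta = UpdateSeq - CompactSeq,
                case Delta < Threshold of
                    true  -> block;
                    false -> retry
                end.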


          People

          • Assignee: Unassigned
          • Reporter: Robert Newson
          • Votes: 0
          • Watchers: 0
