Affects Version/s: None
Fix Version/s: None
Component/s: Database Core
Recently I started experimenting with batching writes in the DB updater.
In a test with 100 concurrent writers of 1 KB documents, for example, the updater most often collects between 20 and 30 documents to write.
Currently it does a file:write operation for each one. Not only is this slower, it also implies more context switches and stresses the OS/filesystem by allocating a few blocks very often (since we use a pure append write mode). The same optimization can be applied to the BTree node writes.
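The idea can be sketched outside Erlang. Here is a minimal Python analogue (the helper names are hypothetical, not CouchDB code) contrasting one write syscall per collected document with a single batched append:

```python
import os
import tempfile

def write_each(fd, docs):
    # One write() syscall per document (the current updater behavior).
    for d in docs:
        os.write(fd, d)

def write_batched(fd, docs):
    # Concatenate the pending documents and issue a single append.
    os.write(fd, b"".join(docs))

# 30 documents of 1 KB each, roughly what the updater collects per batch.
docs = [bytes([i % 256]) * 1000 for i in range(30)]

for writer in (write_each, write_batched):
    fd, path = tempfile.mkstemp()
    try:
        writer(fd, docs)
        # Both strategies leave identical bytes on disk; only the
        # number of syscalls (and block allocations) differs.
        assert os.fstat(fd).st_size == 30 * 1000
    finally:
        os.close(fd)
        os.remove(path)
```

In Erlang the batching comes essentially for free, since file:write/2 accepts an iolist, so the collected binaries can be passed as a list without copying them into one buffer first.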
The following branch/patch is an experiment in batching writes:
In couch_file there's a quick test method that compares the time taken to write X blocks of size Y versus writing a single block of size X * Y.
Eshell V5.8.2 (abort with ^G)
1> Apache CouchDB 1.2.0aa777195-git (LogLevel=info) is starting.
Apache CouchDB has started. Time to relax.
[info] [<0.37.0>] Apache CouchDB has started on http://127.0.0.1:5984/
1> couch_file:test(1000, 30).
multi writes of 30 binaries, each of size 1000 bytes, took 1920us
batch write of 30 binaries, each of size 1000 bytes, took 344us
2> couch_file:test(4000, 30).
multi writes of 30 binaries, each of size 4000 bytes, took 2002us
batch write of 30 binaries, each of size 4000 bytes, took 700us
A three- to five-fold reduction is quite significant, I would say.
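For readers without an Erlang shell handy, a rough Python analogue of couch_file:test(Size, Count) can be sketched as follows (a hypothetical micro-benchmark, not the actual test method; absolute numbers will differ from the Erlang figures above):

```python
import os
import tempfile
import time

def bench(size, count):
    # Analogue of couch_file:test(Size, Count): time Count separate
    # writes of Size bytes versus one write of Size * Count bytes.
    block = b"x" * size

    fd, path = tempfile.mkstemp()
    t0 = time.perf_counter()
    for _ in range(count):
        os.write(fd, block)          # one syscall per block
    multi = time.perf_counter() - t0
    os.close(fd)
    os.remove(path)

    fd, path = tempfile.mkstemp()
    batch_buf = block * count
    t0 = time.perf_counter()
    os.write(fd, batch_buf)          # single syscall for the whole batch
    batch = time.perf_counter() - t0
    os.close(fd)
    os.remove(path)

    return multi, batch

multi, batch = bench(1000, 30)
print(f"multi writes took {multi * 1e6:.0f}us, batch write took {batch * 1e6:.0f}us")
```

The batched variant typically wins because it issues one syscall instead of thirty, though the exact ratio depends on the OS, filesystem, and write size.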
Lower response times are most noticeable when delayed_commits is set to true.
Running a writes-only test with this branch gave me:
While with trunk I got:
These tests were done on Linux with ext4 (and OTP R14B01).
However, I'm still not 100% sure whether this is worth applying to trunk.