Yesterday I did a lot of tests on replication and found out why the replication was so slow and why the CPU usage was so high on the "server".
The tests consisted of replicating 10000 docs from one "server" node on a machine into 100, then 10, databases on one "client" node on another machine. The replications were launched from the "client" node (the source being the "server" node). The replications on the 100 dbs were launched concurrently. The point is to simulate more or less 100 devices replicating against the "server" node at the same time.
For 100 replication tasks the time was ~1h with basic auth against ~10min without. In the meantime the CPU on the "server" node was at 700% with all 8 cores taken, and throughput was ~6-8 Mb/s. The same difference applies to 10 concurrent replication tasks.
This morning I did a quick test going back to sha1, and the replication was 2x faster; the CPU was much less used (< 100%).
I think this is quite expected, since we are running the following workflow on each request:
- base64.decode(auth header)
- get user doc (cached or not)
- hash the supplied credentials (with the configured number of iterations)
- compare the result to what we have in the user doc or settings
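To make the per-request cost concrete, here is a minimal Python sketch of that workflow, assuming PBKDF2-style iterated hashing with a per-user salt (the `USER_DOC` fields and function names are illustrative, not CouchDB's actual code):

```python
import base64
import hashlib
import hmac

# Hypothetical cached user doc; fields are illustrative only.
USER_DOC = {
    "name": "alice",
    "salt": b"somesalt",
    "iterations": 10000,
    "derived_key": hashlib.pbkdf2_hmac("sha1", b"secret", b"somesalt", 10000),
}

def check_basic_auth(header):
    # 1. base64-decode the Authorization header
    user, _, password = base64.b64decode(header.split()[1]).decode().partition(":")
    if user != USER_DOC["name"]:
        return False
    # 2. re-derive the key -- this iterated hashing loop runs on *every* request
    dk = hashlib.pbkdf2_hmac("sha1", password.encode(),
                             USER_DOC["salt"], USER_DOC["iterations"])
    # 3. constant-time compare against the stored derived key
    return hmac.compare_digest(dk, USER_DOC["derived_key"])

header = "Basic " + base64.b64encode(b"alice:secret").decode()
print(check_basic_auth(header))  # prints True
```

Step 2 is where the CPU goes: the key derivation is deliberately slow by design, and we pay for it on every single replication request.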
But I think we should improve it asap. If one optimisation should be made, it should be here imo. Playing with the number of iterations of the hashing does help, but isn't enough. Also, cookie auth (even if I didn't test it yet) should be faster since it doesn't iterate the hash on each request, but still.
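The reason tuning iterations isn't enough: the derivation cost grows roughly linearly with the count, so lowering it only buys a constant factor while also weakening the stored hash. A quick timing sketch (parameters are arbitrary, just to show the scaling):

```python
import hashlib
import time

def time_pbkdf2(iterations, rounds=20):
    # Average seconds per derivation at a given iteration count.
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.pbkdf2_hmac("sha1", b"secret", b"salt", iterations)
    return (time.perf_counter() - start) / rounds

for n in (10, 1000, 10000):
    print("%6d iterations: %.0f us/request" % (n, time_pbkdf2(n) * 1e6))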
Maybe we should introduce a session system in couch? Since in the end it would only consist in checking one token against another, it may be faster. Or we could use some other solution I don't know of yet. Other dbs are a lot faster for that purpose. Anyway, I don't have a strong opinion on what the solution should be, but this is a big bottleneck for real usages of couchdb.
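As a rough sketch of what I mean by a session system (names and store are hypothetical; a real implementation would need expiry and persistence across nodes): authenticate expensively once, then every later request is just a token lookup instead of a key derivation.

```python
import secrets

# Hypothetical in-memory session store: token -> user name.
SESSIONS = {}

def create_session(user):
    # Done once, after a full (expensive) credential check.
    token = secrets.token_urlsafe(16)
    SESSIONS[token] = user
    return token

def check_session(token):
    # Per-request cost is one dict lookup, not thousands of hash iterations.
    return SESSIONS.get(token)

tok = create_session("alice")
print(check_session(tok))      # prints alice
print(check_session("bogus"))  # prints None
```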