COUCHDB-639: Make replication take advantage of attachment compression and improve push replication for large attachments

Details

    Description

      At the moment, for compressed attachments, the replicator uncompresses the attachments and then compresses them again, which is a waste of CPU time.

      Push replication is also not reliable for very large attachments (500 MB+, for example). Currently it sends the attachments in-lined in the respective JSON doc. Not only does this require too much RAM, it also wastes CPU time doing the base64 encoding of the attachment (plus a decompression step if the attachment is stored compressed), as sketched below.
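      For illustration, here is roughly what an in-lined push of a doc with one attachment looks like today (doc and attachment names are made up); the entire attachment body has to be base64-encoded and held in memory as part of the JSON:

          PUT /target/doc1 HTTP/1.1
          Content-Type: application/json

          {
            "_id": "doc1",
            "_attachments": {
              "video.mpg": {
                "content_type": "video/mpeg",
                "data": "<hundreds of MB of base64 text>"
              }
            }
          }

      Since base64 inflates the payload by about 4/3, a 500 MB attachment turns into roughly 667 MB of request body before it even leaves the source.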

      The following patch (rep-att-comp-and-multipart-trunk*.patch) addresses both issues. Docs containing attachments are now streamed to the target remote DB using the multipart doc streaming feature provided by couch_doc.erl, and compressed attachments are no longer uncompressed and re-compressed during the replication.
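      As a sketch of what the streamed request looks like (boundary, doc and attachment names are made up), the doc is sent as the first part of a multipart/related body with its attachments marked "follows": true, and each attachment body, still in its stored (possibly gzip-compressed) form, is sent as a separate part:

          PUT /target/doc1 HTTP/1.1
          Content-Type: multipart/related; boundary="abc123"

          --abc123
          Content-Type: application/json

          {
            "_id": "doc1",
            "_attachments": {
              "video.mpg": {
                "content_type": "video/mpeg",
                "length": 524288000,
                "follows": true
              }
            }
          }
          --abc123

          <raw attachment bytes, streamed in chunks>
          --abc123--

      Because the attachment bytes are written to the socket in chunks, memory use stays roughly constant regardless of attachment size, and no base64 encoding or decompress/re-compress cycle is needed.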

      JavaScript tests included.

      Previously, replicating a DB containing 2 docs with attachments of 100 MB and 500 MB caused the Erlang VM to consume nearly 1.2 GB of RAM on my system. With this patch applied, it uses about 130 MB of RAM.

      Attachments

        1. rep-att-comp-and-multipart-trunk-2.patch (25 kB, Filipe David Borba Manana)
        2. rep-att-comp-and-multipart-trunk.patch (26 kB, Filipe David Borba Manana)


            People

              Unassigned Unassigned
              fdmanana Filipe David Borba Manana
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

