Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.2, 1.1, 1.2
    • Component/s: Database Core
    • Labels:
      None

      Description

      The following short patch moves the MD5 verification outside the couch_file server:

      https://github.com/fdmanana/couchdb/commit/51c463d682c478dcb273bd88f1ef3046a709689f
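The idea behind the patch can be sketched roughly as follows. This is a hypothetical, simplified gen_server, not the code from the commit: the module name, the in-memory map standing in for the file, and the error terms are all assumptions. The server process only returns the stored MD5 and the raw bytes, and the hash check runs in the calling process, so concurrent readers do not queue behind each other's hashing.

```erlang
%% Hypothetical sketch; not the actual couch_file implementation.
-module(md5_outside_sketch).
-behaviour(gen_server).

-export([start_link/1, read_verified/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Blobs is a map of Key => {StoredMd5, Binary}; it stands in for the file.
start_link(Blobs) ->
    gen_server:start_link(?MODULE, Blobs, []).

%% Runs in the caller's process. The server call returns quickly,
%% and the (comparatively slow) MD5 check happens out here.
read_verified(Server, Key) ->
    case gen_server:call(Server, {read, Key}) of
        {ok, StoredMd5, Bin} ->
            case erlang:md5(Bin) of
                StoredMd5 -> {ok, Bin};
                _Other -> {error, file_corruption}
            end;
        Error ->
            Error
    end.

init(Blobs) ->
    {ok, Blobs}.

%% The server does no hashing: it only looks up and returns the data.
handle_call({read, Key}, _From, Blobs) ->
    Reply = case maps:find(Key, Blobs) of
        {ok, {StoredMd5, Bin}} -> {ok, StoredMd5, Bin};
        error -> {error, not_found}
    end,
    {reply, Reply, Blobs}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

In this sketch a mismatch is reported as `{error, file_corruption}`; how the real code reports it is not shown here.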

      Although the change looks insignificant (couch_util:md5/1 takes about 700us on my machine), I get these significant results with relaximation:

      $ node tests/compare_write_and_read.js --wclients 100 --rclients 200 \
      -name1 md5_out -name2 trunk \
      -url1 http://localhost:5984/ -url2 http://localhost:5985/ \
      --duration 120

      run 1) http://graphs.mikeal.couchone.com/#/graph/5c859b3e7d1b9bd0488cfe271105130c

      run 2) http://graphs.mikeal.couchone.com/#/graph/5c859b3e7d1b9bd0488cfe2711051bba

      The documents used in the test are about 1 KB each.
      If nobody has an objection, I'll commit this to trunk.

        Activity

        Adam Kocoloski added a comment -

        700µs is very significant. In fact, I'm kinda shocked that it takes that long. But given that it does, your results make sense. The server is stuck at about 1500 reads / sec when the md5 is done in the couch_file. What's 1 second / 1500? ~700µs.
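To make that arithmetic explicit, here is a minimal Erlang sketch. It assumes the couch_file server handles one request at a time, so its throughput is capped at one second divided by the per-request service time. The module and function names are made up for illustration.

```erlang
%% Hypothetical helper; names are illustrative only.
-module(serial_ceiling_sketch).
-export([max_ops_per_sec/1, us_per_op/1]).

%% Upper bound on requests/sec for a single process that spends
%% ServiceUs microseconds on each request.
max_ops_per_sec(ServiceUs) when ServiceUs > 0 ->
    1000000 / ServiceUs.

%% The inverse: microseconds spent per request at a given rate.
us_per_op(OpsPerSec) when OpsPerSec > 0 ->
    1000000 / OpsPerSec.
```

With a service time of 700µs the ceiling comes out at about 1429 requests per second, in the same range as the observed plateau of roughly 1500 reads per second.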

        I'll double-check the md5 numbers on a couple of setups to make sure. I wonder if we should be looking into faster hashing at this point ...

        Anyway, great find Filipe.

        Filipe Manana added a comment -

        Seems like erlang:phash2/1 is slightly faster:

        Erlang R14B (erts-5.8.1) [source] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false]

        Eshell V5.8.1 (abort with ^G)
        1> crypto:start().
        ok
        2> Data = crypto:rand_bytes(8 * 1024).
        <<16,140,225,70,15,147,93,220,103,134,3,164,133,141,199,
        80,46,218,50,16,50,2,8,188,222,170,27,61,141,...>>
        3> timer:tc(crypto, md5, [Data]).
        {80, <<30,219,172,41,172,202,13,154,202,91,72,231,184,66,173,155>>}
        4> timer:tc(crypto, md5, [Data]).
        {72, <<30,219,172,41,172,202,13,154,202,91,72,231,184,66,173,155>>}
        5> timer:tc(crypto, md5, [Data]).
        {73, <<30,219,172,41,172,202,13,154,202,91,72,231,184,66,173,155>>}
        6> timer:tc(erlang, phash2, [Data]).
        {50,4852638}
        7> timer:tc(erlang, phash2, [Data]).
        {40,4852638}
        8> timer:tc(erlang, phash2, [Data]).
        {40,4852638}
        9> Data2 = crypto:rand_bytes(50 * 1024).
        <<193,139,221,153,68,71,7,147,122,135,218,225,180,33,31,
        222,215,248,23,120,138,230,166,167,205,108,89,110,33,...>>
        10> timer:tc(crypto, md5, [Data2]).
        {377, <<240,201,111,48,30,171,6,111,245,28,100,41,129,48,92,166>>}
        11> timer:tc(crypto, md5, [Data2]).
        {377, <<240,201,111,48,30,171,6,111,245,28,100,41,129,48,92,166>>}
        12> timer:tc(crypto, md5, [Data2]).
        {379, <<240,201,111,48,30,171,6,111,245,28,100,41,129,48,92,166>>}
        13> timer:tc(erlang, phash2, [Data2]).
        {222,108981774}
        14> timer:tc(erlang, phash2, [Data2]).
        {222,108981774}
        15> timer:tc(erlang, phash2, [Data2]).
        {225,108981774}

        16>
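Single timer:tc calls like the ones above are noisy, so a comparison is easier to trust when averaged over many runs. The following is a minimal sketch under some assumptions: the module and function names are hypothetical, and it uses erlang:md5/1 and crypto:strong_rand_bytes/1 in place of crypto:md5/1 and crypto:rand_bytes/1, which are not available in recent OTP releases.

```erlang
%% Hypothetical micro-benchmark helper; names are illustrative only.
-module(hash_bench_sketch).
-export([avg_us/3, compare/2]).

%% Average wall-clock microseconds per call of Fun(Data) over N runs.
avg_us(Fun, Data, N) when is_integer(N), N > 0 ->
    {TotalUs, ok} = timer:tc(fun() -> repeat(Fun, Data, N) end),
    TotalUs / N.

repeat(_Fun, _Data, 0) ->
    ok;
repeat(Fun, Data, N) ->
    _ = Fun(Data),
    repeat(Fun, Data, N - 1).

%% Hash the same random binary of Size bytes with both functions.
compare(Size, N) ->
    Data = crypto:strong_rand_bytes(Size),
    [{md5, avg_us(fun erlang:md5/1, Data, N)},
     {phash2, avg_us(fun erlang:phash2/1, Data, N)}].
```

For example, `hash_bench_sketch:compare(8 * 1024, 1000)` returns a list of two `{Name, AverageMicroseconds}` pairs. Note that erlang:phash2/1 returns an integer in the range 0..2^27-1 rather than a 128-bit digest, so the two functions are not interchangeable for integrity checks even if one is faster.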

        Filipe Manana added a comment -

        Committed to trunk (revision 1043524).

        Filipe Manana added a comment -

        Backported to 1.1.x and 1.0.x as well.

        Filipe Manana added a comment -

        Applied to trunk, 1.1.x and 1.0.x


          People

          • Assignee:
            Filipe Manana
            Reporter:
            Filipe Manana
          • Votes:
            0
            Watchers:
            1
