CouchDB / COUCHDB-1334

Indexer speedup (for non-native view servers)

    Details

    • Skill Level:
      Guru Level (Everyone buy this person a beer at the next conference!)

      Description

      The following 2 patches significantly improve view index generation/update time and reduce CPU consumption.

      The first patch makes the view updater's batching more efficient by ensuring that each btree bulk insertion adds or removes at least N (=100) key/value pairs. It also slows the growth of the index file, since fewer obsolete btree nodes are left behind. The new indexer in master/trunk (by Paul Davis) already behaves this way.
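The batching idea can be sketched as follows. This is only an illustrative Python model, not the patch itself: `MinBatchWriter`, `bulk_insert` and the one-row-per-document workload are hypothetical names and assumptions; only the threshold of 100 comes from the description.

```python
from typing import Callable, List, Tuple

KV = Tuple[str, int]


class MinBatchWriter:
    """Buffers key/value changes and flushes once at least `min_batch` are pending."""

    def __init__(self, bulk_insert: Callable[[List[KV]], None], min_batch: int = 100) -> None:
        self._bulk_insert = bulk_insert  # hypothetical stand-in for a btree bulk insertion
        self._min_batch = min_batch
        self._pending: List[KV] = []

    def add(self, kvs: List[KV]) -> None:
        self._pending.extend(kvs)
        if len(self._pending) >= self._min_batch:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self._bulk_insert(self._pending)
            self._pending = []


calls: List[int] = []  # size of every bulk insertion that was issued
writer = MinBatchWriter(lambda batch: calls.append(len(batch)), min_batch=100)
for doc_id in range(250):
    writer.add([(f"doc-{doc_id}", 1)])  # assume one emitted row per document
writer.flush()  # write out the final partial batch
print(calls)  # [100, 100, 50]
```

In this model, 250 single-row updates turn into three bulk insertions instead of 250. In an append-only btree each insertion rewrites nodes, so fewer and larger insertions should leave fewer obsolete nodes in the file.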

      The second patch maximizes throughput with an external view server (such as couchjs). It makes the pipe (Erlang port) communication between the Erlang VM (couch_os_process, basically) and the view server more efficient, since both sides spend less time blocked on reading from the pipe.
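To illustrate the difference in communication shape, here is a rough Python model, not the Erlang implementation. The child process is a hypothetical stand-in for a view server that answers every input line with one output line. `prompt_lockstep` and `prompt_pipelined` are made-up names.

```python
import subprocess
import sys
import threading
from typing import List

# Hypothetical stand-in for a view server: one output line per input line.
CHILD = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def spawn() -> subprocess.Popen:
    return subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )


def prompt_lockstep(lines: List[str]) -> List[str]:
    """Write one request, then block until its response arrives, for every request."""
    proc = spawn()
    out = []
    for line in lines:
        proc.stdin.write(line + "\n")
        proc.stdin.flush()
        out.append(proc.stdout.readline().rstrip("\n"))
    proc.stdin.close()
    proc.wait()
    return out


def prompt_pipelined(lines: List[str]) -> List[str]:
    """Keep writing requests from one thread while this thread reads responses."""
    proc = spawn()

    def writer() -> None:
        for line in lines:
            proc.stdin.write(line + "\n")
        proc.stdin.close()  # flushes buffered requests and signals end of input

    thread = threading.Thread(target=writer)
    thread.start()
    out = [proc.stdout.readline().rstrip("\n") for _ in lines]
    thread.join()
    proc.wait()
    return out


requests = [f"doc-{i}" for i in range(1000)]
print(prompt_lockstep(requests[:3]))
print(prompt_pipelined(requests) == [r.upper() for r in requests])
```

Both functions return the same answers in the same order. The pipelined variant simply avoids waiting for each response before sending the next request, which is the effect the patch aims for on the Erlang side.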

      Here follow some benchmarks.

      test database at http://fdmanana.iriscouch.com/test_db (1 million documents)

      branch 1.2.x

      $ echo 3 > /proc/sys/vm/drop_caches
      $ time curl http://localhost:5984/test_db/_design/test/_view/test1
      {"rows":[

      {"key":null,"value":1000000}

      ]}

      real 2m45.097s
      user 0m0.006s
      sys 0m0.007s

      view file size: 333Mb

      CPU usage:

      $ sar 1 60
      22:27:20 %usr %nice %sys %idle
      22:27:21 38 0 12 50
      (....)
      22:28:21 39 0 13 49
      Average: 39 0 13 47

      branch 1.2.x + batch patch (first patch)

      $ echo 3 > /proc/sys/vm/drop_caches
      $ time curl http://localhost:5984/test_db/_design/test/_view/test1
      {"rows":[

      {"key":null,"value":1000000}

      ]}

      real 2m12.736s
      user 0m0.006s
      sys 0m0.005s

      view file size 72Mb

      branch 1.2.x + batch patch + os_process patch

      $ echo 3 > /proc/sys/vm/drop_caches
      $ time curl http://localhost:5984/test_db/_design/test/_view/test1
      {"rows":[

      {"key":null,"value":1000000}

      ]}

      real 1m9.330s
      user 0m0.006s
      sys 0m0.004s

      view file size: 72Mb

      CPU usage:

      $ sar 1 60
      22:22:55 %usr %nice %sys %idle
      22:23:53 22 0 6 72
      (....)
      22:23:55 22 0 6 72
      Average: 22 0 7 70

      master/trunk

      $ echo 3 > /proc/sys/vm/drop_caches
      $ time curl http://localhost:5984/test_db/_design/test/_view/test1
      {"rows":[

      {"key":null,"value":1000000}

      ]}

      real 1m57.296s
      user 0m0.006s
      sys 0m0.005s

      master/trunk + os_process patch

      $ echo 3 > /proc/sys/vm/drop_caches
      $ time curl http://localhost:5984/test_db/_design/test/_view/test1
      {"rows":[

      {"key":null,"value":1000000}

      ]}

      real 0m53.768s
      user 0m0.006s
      sys 0m0.006s
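As a worked check of the numbers above, the following Python snippet converts the reported "real" times to seconds and derives the speedups. The helper names `to_seconds` and `speedup` are illustrative only; the times and file sizes are copied from the runs above.

```python
import re


def to_seconds(real: str) -> float:
    """Convert a `time` value such as '2m45.097s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", real)
    if match is None:
        raise ValueError(f"unexpected time format: {real!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" times copied from the benchmark runs above.
runs = {
    "1.2.x": "2m45.097s",
    "1.2.x + batch": "2m12.736s",
    "1.2.x + batch + os_process": "1m9.330s",
    "master": "1m57.296s",
    "master + os_process": "0m53.768s",
}
seconds = {name: to_seconds(value) for name, value in runs.items()}


def speedup(baseline: str, patched: str) -> float:
    return seconds[baseline] / seconds[patched]


for baseline, patched in [
    ("1.2.x", "1.2.x + batch"),
    ("1.2.x", "1.2.x + batch + os_process"),
    ("master", "master + os_process"),
]:
    print(f"{patched}: {speedup(baseline, patched):.2f}x faster than {baseline}")

print(f"view file size ratio on 1.2.x: {333 / 72:.1f}x smaller with the batch patch")
```

On these figures the batch patch alone gives roughly a 1.2x speedup on 1.2.x, both patches together about 2.4x, and the os_process patch on master about 2.2x. The view file shrinks by a factor of about 4.6.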

          Activity

          Adam Kocoloski added a comment -

          COUCHDB-700 contains a similar patch to batch the view updater's writes to the btree. I'll take a closer look to see if there are significant differences between the two versions.

          Adam Kocoloski added a comment -

          Hi Filipe, good to see that pipelining makes such a difference, but I'm not sure introducing these big changes on a release branch is a good idea. I thought the motivation for branching 1.2.x was to give it some time to stabilize and fix any bugs that we might find.

          Randall Leeds added a comment -

          Then again, Adam, we've got the SpiderMonkey issues to take care of and we still need to figure out whether or not that changes 1.2 into 2.0 (unless maybe we have enough new stuff for 1.2 that we would just release a not-forward-compatible-with-SM-trunk major version anyway). We should collectively make a decision about what to do with these branches soon.

          Adam Kocoloski added a comment -

          I vote we release 1.2 without tackling that issue. 1.2 has the new replicator, JSON in C, snappy compression ... plenty of great stuff that we want to get into users' hands.

          Paul Joseph Davis added a comment -

          @Randall, now that 1.8.5 is out, we should be able to lean on that for quite a long time. If people want to package versions of SpiderMonkey trunk in their distro, I'm disinclined to put too much immediate effort into supporting those random versions.

          @Filipe, reading the patch I think the idea is pretty good in general but I'd implement it a bit differently. Firstly, the logic for whether or not pipelining is used shouldn't be exposed to the client. That's just going to entangle a whole bunch of API knowledge in the wrong place. I've been meaning to go back and finish the refactoring of couch_(os|native)_process and couch_query_servers which would make this behavior possible.

          The other part of this that might be interesting is the erlang:port_connect/2 call that can set the destination Pid for that port. I played with it a bit during my refactoring work but couldn't get it to work quite right. I didn't spend too much time figuring it out, but it might be a way to skip the intermediary process and extra message passing.

          http://erlang.org/doc/man/erlang.html#port_connect-2

          Filipe Manana added a comment -

          @Adam I haven't heard yet about plans to release 1.2.0 soon (where soon can mean in a few months). I think this gives a very good benefit that many users would be happy for. But I understand your point, it's not wrong.

          @Paul, yes, makes sense. This was more of an experiment. I looked into port_connect/2 before and was getting exit_status errors before receiving any responses back, so I must have done something wrong. I just updated the patch, against master, so that all this logic is in couch_os_process; it doesn't spawn a helper process and uses port_connect. Let me know what you think. Thanks.

          Filipe Manana added a comment -

          Second pipeline patch against master

          Paul Joseph Davis added a comment -

          @Filipe, Awesome, this is considerably better than the old version.

          As to this part:

          + % Can throw badarg error, when OsProc Pid is dead.
          + (catch port_connect(OsProc#os_proc.port, Pid))

          That looks like the key to what I hadn't managed to track down when I tried something similar. I'm pretty sure we should be fine with couchspawnkillable, but do we need to close the port here and/or ignore some port exit status messages in this process? The other thing that is a bit confusing is why OsProc's Pid would be dead while it's sitting idle. I haven't thought through all the implications here, I guess.

          Does this version maintain the same speedups as before? When I tried this approach I was actually doing it slightly differently by having the doc reader process send docs to the port which would then forward them directly to the writer process. There was some stuff that got a bit funky when I tried this though. IIRC it was something like, I had to pass deleted doc update_seq's directly to the writer process or it'd break if the last update was a deletion (cause the writer would never see the update seq). Anyway, just a thought.

          Good work on this.

          Filipe Manana added a comment -

          Paul, yep performance was the same as before, forgot to mention that.

          There's a thing, the port_connect in after might fail because of 2 things:
          1) os process died (it was linked to the port)
          2) readline got an error or timeout, closed the port and threw an exception, so the port_connect in after will fail with badarg because the port was closed

          This diff over the previous patch makes it more clear, besides adding a necessary unlink:

          - % Can throw badarg error, when OsProc Pid is dead.
          - (catch port_connect(OsProc#os_proc.port, Pid))
          + % Can throw badarg error, when OsProc Pid is dead or port was closed
          + % by the readline function on error/timeout.
          + (catch port_connect(OsProc#os_proc.port, Pid)),
          + unlink(OsProc#os_proc.port)
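The hand-back pattern discussed here can be modelled roughly in Python. This sketch covers only the "port was already closed" case. `Port`, `ClosedPortError` and this `prompt_many` are hypothetical stand-ins for the Erlang port, the badarg error and the patched function, not their real APIs.

```python
class ClosedPortError(Exception):
    """Hypothetical stand-in for the badarg error raised on a closed port."""


class Port:
    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.closed = False

    def connect(self, new_owner: str) -> None:
        if self.closed:
            raise ClosedPortError("port is closed")
        self.owner = new_owner


def prompt_many(port: Port, server: str, caller: str, fail: bool = False) -> str:
    port.connect(caller)  # the caller now talks to the port directly
    try:
        if fail:
            port.closed = True  # models readline closing the port on error/timeout
            raise TimeoutError("view server did not answer")
        return "ok"
    finally:
        try:
            port.connect(server)  # hand the port back to its original owner
        except ClosedPortError:
            pass  # expected when the port was already closed; keep the original error


healthy = Port(owner="server")
print(prompt_many(healthy, "server", "caller"), healthy.owner)
```

The cleanup step runs on both the success and the failure path. Swallowing the hand-back error means the caller still sees the original timeout rather than a secondary failure from the cleanup.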

          (uploading new patch)

          Filipe Manana added a comment -

          On no objection, I'll push master-4-0002-More-efficient-communication-with-the-view-server.patch to master.

          Paul Joseph Davis added a comment -

          +1

          Filipe Manana added a comment -

          Latest patch applied against master.

          Dave Cottlehuber added a comment -

          NB: this breaks the Windows build; specifically, the Erlang VM hangs (no console access nor HTTP response) during tests such as purge. Killing couchjs frees the VM again, until the next deadlock.

          Viewed in windbg or a similar debugger, couchjs has unexpected data in the buffer and is waiting for a response from the port. The port is apparently waiting on data to return from couchjs, so a Mexican standoff occurs. Why this causes the VM to hang is not yet clear. More details (and a lot of unrelated stuff) in COUCHDB-1346.

          Filipe Manana added a comment -

          Dave,

          I think we need to pass the option 'overlapped_io' to the open_port call in couch_os_process.
          This is to ensure that parallel reads and writes to the underlying pipe on Windows are not blocking [1], allowing the Erlang C module to do async IO [2] with the pipe.
          Are you able to try this?

          Reducing the buffer size to something very small, such as 8 bytes, doesn't block on Linux or OS X (with both small docs, around 20 bytes, and very large docs, around 100Kb).
          So this seems like a purely Windows-specific issue.

          [1] http://msdn.microsoft.com/en-us/library/windows/desktop/aa365150(v=vs.85).aspx
          [2] https://github.com/erlang/otp/blob/maint/erts/emulator/sys/win32/sys.c#L2246

          Dave Cottlehuber added a comment -

          Reopening per history in COUCHDB-1346.

          Dave Cottlehuber added a comment -

          I'm in agreement with Filipe, this is some weird Windows breakage.

          Including overlapped_io per http://www.erlang.org/doc/man/erlang.html is not sufficient.

          https://github.com/erlang/otp/blob/OTP_R15B03/erts/emulator/sys/win32/sys.c#L393 and https://github.com/erlang/otp/blob/OTP_R15B03/erts/emulator/sys/win32/sys.c#L2246 scare me.

          The following patch (enabling overlapped_io) on Erlang side passes 1.2.0 tests and 1.3.x branch with the patch reverted - i.e. so far this looks safe.

          diff --git i/src/couchdb/couch_os_process.erl w/src/couchdb/couch_os_process.erl
          index db62d49..d5ef857 100644
          --- i/src/couchdb/couch_os_process.erl
          +++ w/src/couchdb/couch_os_process.erl
          @@ -20,7 +20,7 @@

           -include("couch_db.hrl").

          --define(PORT_OPTIONS, [stream, {line, 4096}, binary, exit_status, hide]).
          +-define(PORT_OPTIONS, [stream, {line, 4096}, binary, exit_status, hide, overlapped_io]).

           -record(os_proc,
               {command,

          However that's not enough for this to work with the full patch in place.

          A work-around to avoid the specific issue for Windows didn't pan out in time for the release timeframe:

          diff --git a/src/couchdb/couch_os_process.erl b/src/couchdb/couch_os_process.erl
          index 3a267be..5f45f5f 100644
          --- a/src/couchdb/couch_os_process.erl
          +++ b/src/couchdb/couch_os_process.erl
          @@ -58,6 +58,14 @@ prompt(Pid, Data) ->
               end.

           prompt_many(Pid, DataList) ->
          +    case os:type() of
          +        {win32, _} ->
          +            lists:map(fun(Data) -> prompt(Pid, Data) end, DataList);
          +        _ ->
          +            do_prompt_many(Pid, DataList)
          +    end.
          +
          +do_prompt_many(Pid, DataList) ->
               OsProc = gen_server:call(Pid, get_os_proc, infinity),
               true = port_connect(OsProc#os_proc.port, self()),
               try
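The shape of this work-around, sketched in Python for illustration. The function and parameter names are hypothetical, and `sys.platform == "win32"` plays the role of the `os:type()` check. The two callables stand in for the one-at-a-time `prompt` and the pipelined `do_prompt_many`.

```python
import sys
from typing import Callable, List, Optional


def prompt_many(
    data_list: List[str],
    prompt_one: Callable[[str], str],
    prompt_pipelined: Callable[[List[str]], List[str]],
    platform: Optional[str] = None,
) -> List[str]:
    """Use pipelining everywhere except Windows, where each item is prompted in turn."""
    platform = sys.platform if platform is None else platform
    if platform == "win32":
        return [prompt_one(data) for data in data_list]
    return prompt_pipelined(data_list)


# Hypothetical stubs that record which path was taken.
taken: List[str] = []


def one(data: str) -> str:
    taken.append("one")
    return data.upper()


def many(batch: List[str]) -> List[str]:
    taken.append("many")
    return [data.upper() for data in batch]


print(prompt_many(["a", "b"], one, many, platform="win32"), taken)
```

Callers get the same results on every platform; only the path differs. That split into two code paths is the trade-off debated later in this thread.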

          ASF subversion and git services added a comment -

          Commit 6ffc52796182a8d3673f3648c59c7e9abddf4fa2 in branch refs/heads/1334-revert-feature-view-server-pipelining from Dave Cottlehuber
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb.git;h=6ffc527 ]

          COUCHDB-1334 - revert "More efficient communication with the view server"

          This reverts commit a851c6e

          • COUCHDB-1334 breaks with Windows + couchjs in unexplained ways
          • reducing to 1 concurrent query server is not sufficient
          • Testing with open_port options overlapped_io was not in itself sufficient
          • http://erlang.org/doc/man/erlang.html find overlapped_io
          • Refer history in COUCHDB-1346

          Dave Cottlehuber added a comment -

          See branch 1334-revert-feature-view-server-pipelining

          Waiting on a couple eyes and a +1 for merging/committing this.

          Same patch as was applied to 1.3.x series, see sha#a851c6e5

          1334-fix-revert-feature-view-server-pipelining

          Jan Lehnardt added a comment -

          Some more context.

          We decided to back this out of 1.3.x in the 12/12/12 IRC Meeting: http://mail-archives.apache.org/mod_mbox/couchdb-dev/201212.mbox/%3cCAL+Y1nvmezMBDOT2R7v64bbH4752Nis7sTORFe5A5_0D+10CFg@mail.gmail.com%3e

          We also backed it out of master in preparation for 1.4.0 because the Windows side of things isn’t solved yet.

          Robert Newson is looking into conditionally enabling the performance improvement on non-Windows systems.

          Robert Newson added a comment -

          Let's back it out of 1.4 (i.e., from master before we fork 1.4.x) and look to restore the improvement in a way that works everywhere. It's too late to fix it for 1.4 and I'm not that comfortable having the code follow a different path based on os:type() after all.

          ASF subversion and git services added a comment -

          Commit 6ffc52796182a8d3673f3648c59c7e9abddf4fa2 in branch refs/heads/master from Dave Cottlehuber
          [ https://git-wip-us.apache.org/repos/asf?p=couchdb.git;h=6ffc527 ]

          COUCHDB-1334 - revert "More efficient communication with the view server"

          This reverts commit a851c6e

          • COUCHDB-1334 breaks with Windows + couchjs in unexplained ways
          • reducing to 1 concurrent query server is not sufficient
          • Testing with open_port options overlapped_io was not in itself sufficient
          • http://erlang.org/doc/man/erlang.html find overlapped_io
          • Refer history in COUCHDB-1346

          Dirkjan Ochtman added a comment -

          Since we're getting close to 1.5.0 now, any progress on getting this working again?

          Dave Cottlehuber added a comment -

          To clarify, for me personally this is a "Can't Fix" - not that I don't want to, but I've spent enough time trying to understand what's happening to know that I'm out of my depth in Windows-concurrent-io-stuff here. I'll gladly work with somebody who knows this but doesn't know couch or erlang to fix it. Filipe Manana, on an entirely unrelated note, I'll be at LXJS next week; maybe I can bribe you with dinner on this.

          Joan Touzet added a comment -

          I was going to jump in with an os:type() wrapper, but Robert Newson suggested that we should "look to restore the improvement in a way that works everywhere" and -1'd that. Robert Newson, do you have any other ideas to start things off?

          James Birmingham added a comment -

          A performance gain in this area is of great practical use for many of us.
          The skills necessary to solve this for Windows aren't forthcoming, but the majority of the user base is non-Windows (I don't know the actual figures; do they even exist?).
          If there is a real-world user on Windows who needs this fix, then perhaps they will step up once this is released and the issue is more widely known. Please don't hold the rest of us back for the sake of purity of code.

          Joan Touzet added a comment -

          I'll take a look at getting this fixed on Windows - or bypassed if there is no alternative - in the next 10 days. Dave Cottlehuber I may poke you on IRC for assistance.

          Joan Touzet added a comment -

          Status update: I now have the Win32 dev env't working - thanks to Dave Cottlehuber for his help in getting this established. I have limited time tomorrow, but all day Friday to make progress. Stay tuned.

          Joan Touzet added a comment -

          After 4 days on this, I thought I got close by tracking down an allocation error revealed by a debug ERTS build in the snappy NIF, which went away when I set compression = none. I ended up rewriting the NIF to use the new snappy C bindings rather than the Sink/Source approach in Filipe Manana's original NIF. I was all ready to declare victory.

          That is, until I compared notes with Dave Cottlehuber, who seems to get different experimental results. I'm going to have to wipe my build and start over, and see if I can duplicate the problem, and try all combinations - which includes erl.exe vs. werl.exe.


            People

            • Assignee:
              Joan Touzet
              Reporter:
              Filipe Manana
            • Votes:
              2
              Watchers:
              8