CouchDB / COUCHDB-1490

Problems with views on large JSON documents

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      Mac OS X 10.6.8, Intel architecture (x86_64), 8 GB of RAM, Erlang R15B01 (erts-5.9.1)

    • Skill Level:
      Dont Know

      Description

      Hi,

      I run a CouchDB server (v1.2.0), installed with brew, on a Mac (Intel architecture, 8 GB of RAM,
      OS X version 10.6.8).

      The server stores large JSON documents (examples: https://raw.github.com/cvdlab-bio/webpdb/develop/docs/jsons/2LGB-pretty-print.json and https://raw.github.com/cvdlab-bio/webpdb/develop/docs/jsons/2CRK-pretty-print.json ) for a small university project.

      When we load more than 3 of these documents, all the map functions (which we created to retrieve documents by something other than a simple get by ID) stop working.
      A typical map is:

      function(doc) {
        if (doc.TITLE.title.match('.*INSULIN.*') !== null) emit(doc.ID, doc);
      }

      but even a simple

      function(doc) {
        emit(doc.ID, doc.ID);
      }

      ceases to work, whereas with only 2 or 3 documents in the database both work fine. I tried increasing the stack for couchjs (1 GB now; going over 1 GB doesn't seem to work), increasing the file limit (4096), and increasing the process timeout, but in the end I get no results, only this error from the database:

      os_process_error {exit_status,0}
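For reference, a more defensive variant of the INSULIN map above could look like the sketch below. It is only an illustration, not a fix for the os_process_error: it skips documents that lack TITLE.title instead of throwing, and it emits null rather than the whole document so the index does not store a second copy of each large document (the document can be fetched with include_docs=true at query time). The emit() stub and the sample documents are assumptions added so the snippet runs outside couchjs.

```javascript
// Minimal stand-in for CouchDB's emit(), so the sketch runs outside couchjs.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Same filter as the INSULIN map above, but guarded and with a null value.
function map(doc) {
  var title = doc.TITLE && doc.TITLE.title;
  if (typeof title === 'string' && title.indexOf('INSULIN') !== -1) {
    emit(doc.ID, null); // query with ?include_docs=true to get the document
  }
}

// Hypothetical sample documents, not taken from the real data set.
map({ ID: 'example-1', TITLE: { title: 'EXAMPLE INSULIN ENTRY' } });
map({ ID: 'example-2' }); // no TITLE: skipped instead of raising an error
console.log(rows);
```

In a design document only the body of map() would be used; the stub and the calls at the bottom exist purely for local testing.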

      1. error.log
        8 kB
        Francesco

        Activity

        Ryan Richt added a comment - edited

        Francesco,

        I'm guessing you don't need to do any M/R indexing over the protein structure description (the model / atoms) that makes up most of the document.

        If this is true, move that whole sub-tree of the JSON to a binary attachment. You can't M/R over it, and the doc will be about the same size, but the amount of data the view indexer has to pack/unpack will be greatly reduced and your problem should go away. I know that's not a real solution, but it is more in line with the intended use cases of CouchDB.

        We've seen documents work best when the JSON is a few kB to a few MB, but attachments can be in the GB range without issues.
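The suggestion above could be sketched roughly as follows, assuming Node.js (for Buffer) and CouchDB's inline attachment format, where base64 data is stored under the document's _attachments field. The function name, the MODEL key, and the attachment name model.json are hypothetical; the real documents may name the structure sub-tree differently.

```javascript
// Move selected top-level keys of a document into an inline JSON attachment.
// Key names and the attachment name are hypothetical placeholders.
function splitDocument(doc, heavyKeys, attachmentName) {
  const slim = {};
  const heavy = {};
  for (const key of Object.keys(doc)) {
    if (heavyKeys.includes(key)) heavy[key] = doc[key];
    else slim[key] = doc[key];
  }
  // CouchDB accepts inline attachments as base64 under _attachments.
  slim._attachments = Object.assign({}, doc._attachments, {
    [attachmentName]: {
      content_type: 'application/json',
      data: Buffer.from(JSON.stringify(heavy), 'utf8').toString('base64')
    }
  });
  return slim;
}

// Hypothetical sample document, not taken from the real data set.
const slim = splitDocument(
  { _id: 'example-1', ID: 'example-1', TITLE: { title: 'EXAMPLE' }, MODEL: [[0, 1, 2]] },
  ['MODEL'],
  'model.json'
);
console.log(Object.keys(slim));
```

The slimmed document keeps the fields the views need (ID, TITLE), so the view indexer only has to serialize those, while the full structure stays retrievable as an attachment.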

        Francesco added a comment - edited

        Ryan,

        Well, at the moment it's true that we don't, because we stopped expanding the project due to this problem, but we might query some parts of the model in the future if this problem is resolved.

        Thank you anyway for the suggestion... we'll try storing only the essential parts of the JSON and adding the full one as an attachment (although this is a bad bug :)

        Francesco added a comment -

        We tried on an old Linux machine (x86, 1 GB of RAM) with CouchDB 1.0.1 (Erlang R14B02/5.8.3), and the views work with more than 3 JSON documents (we're trying to import more, to see what the limit is). It seems to be a CouchDB 1.2.0 issue.

        Paul Joseph Davis added a comment -

        What version(s) of SpiderMonkey are you using?

        Francesco added a comment -

        On the Linux machine (CouchDB 1.0.1) ldd reports libmozjs185-1.0; on Mac OS X (CouchDB 1.2.0) otool -L reports libmozjs185.1.0.0 (I assume it's the same as the Linux one).

        Robert Newson added a comment -

        I get the timeout (OS X 10.7.4) with CouchDB 1.2.0 using SpiderMonkey 1.7.0 but not when using SpiderMonkey 1.8.5. I've done this test several times and it's consistent: it works 100% of the time with 1.8.5 (I used 20 copies of the 2LGB document) and 0% of the time with 1.7.0.

        Francesco added a comment -

        This is the version of SpiderMonkey that brew downloads and compiles: http://ftp.mozilla.org/pub/mozilla.org/js/js185-1.0.0.tar.gz
        Nonetheless, with CouchDB 1.2.0 and the above configuration I get timeouts.

        Robert Newson added a comment -

        Yup. I am using Homebrew for these tests.

        Francesco added a comment -

        Doing the same here, installing everything through Homebrew.

        Francesco added a comment -

        Hi, I tried on Ubuntu 12.04 x64 (i7, 16 GB of RAM) with the latest stable release compiled via https://github.com/iriscouch/build-couchdb/ (libmoz1.8.5 and Erlang V5.9). The result is the same as on Mac OS X (see attachment).

        Beano added a comment - edited

        I had the same problem on Ubuntu 12.10 with CouchDB 1.2./js 1.8.5. I think I've got a fix/patch. I found that adjusting the size of the JavaScript runtime via JS_NewRuntime fixed it for me. It is currently hard-coded to 64 MB.

        I added a command-line flag to couchjs with reference to this similar issue:

        https://issues.apache.org/jira/browse/COUCHDB-893

        I've only compiled and tested against js 1.8.5. I would be happy to contribute a patch but I'd need some guidance on how to go about doing this.
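To get a rough feel for how a document compares with the 64 MB runtime size mentioned above, one option is to measure its serialized size. This is only a sketch, assuming Node.js: the byte count of the JSON text is a rough proxy, not the memory the JavaScript runtime actually needs to hold the parsed document, and the helper names and the sample document are hypothetical.

```javascript
// Serialized size of a document in bytes (UTF-8), as a rough proxy only.
function jsonByteSize(doc) {
  return Buffer.byteLength(JSON.stringify(doc), 'utf8');
}

// True when the serialized size exceeds the given budget in bytes.
function exceedsBudget(doc, budgetBytes) {
  return jsonByteSize(doc) > budgetBytes;
}

// Hypothetical sample document, not taken from the real data set.
const sample = { ID: 'example-1', TITLE: { title: 'EXAMPLE' } };
console.log(jsonByteSize(sample), exceedsBudget(sample, 64 * 1024 * 1024));
```

A document well below the budget can still exhaust the runtime, so this check is useful only for spotting obviously oversized documents before they reach the view server.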

        Dave Cottlehuber added a comment -

        Hi Beano, basically anything will do:

        • A patch or diff file is fine; just attach it to this ticket.
        • If you prefer git/GitHub, you can send us a pull request from your fork of http://github.com/apache/couchdb/ or a git diff.

        Robert Kowalski added a comment -

        Abandoned issue.

        We are currently suggesting SpiderMonkey 1.8.5 for installs.


          People

          • Assignee:
            Unassigned
            Reporter:
            Francesco
          • Votes:
            1
            Watchers:
            7
