CouchDB
  1. CouchDB
  2. COUCHDB-1075

Circular require's in CommonJS modules

    Details

      Description

      Having a CommonJS module A which requires B, when B also requires A causes the stack to fill up with require calls. A prerequisite for this fix is the caching of modules, even if it is only on a per-request basis.

      Patch incoming.

      1. module_cache.diff
        7 kB
        Caolan McMahon

        Activity

        Hide
        Caolan McMahon added a comment -

        Implements a temporary module cache for the duration of a request and fixes circular require's in commonjs modules

        Show
        Caolan McMahon added a comment - Implements a temporary module cache for the duration of a request and fixes circular require's in commonjs modules
        Hide
        Caolan McMahon added a comment -

        A more persistent module cache is implemented here:
        https://issues.apache.org/jira/browse/COUCHDB-890

        However, a module cache which persists between requests may have wider consequences which need discussing.

        Show
        Caolan McMahon added a comment - A more persistent module cache is implemented here: https://issues.apache.org/jira/browse/COUCHDB-890 However, a module cache which persists between requests may have wider consequences which need discussing.
        Hide
        Caolan McMahon added a comment -

        Just FYI, this patch makes a fairly normal Kanso (http://kansojs.org) app respond around 10x faster on the initial request (the one rendered by CouchDB). Removing the per-request limitation by commenting out the line "Couch.module_cache = {};" from State.reset you gain another 10x speed increase on-top of the previous for all subsequent requests.

        I think the small risks of users abusing the module_cache for storing state between requests (which you might argue is at their own risk) are by far outweighed by the performance increases for serious couchapps.

        Show
        Caolan McMahon added a comment - Just FYI, this patch makes a fairly normal Kanso ( http://kansojs.org ) app respond around 10x faster on the initial request (the one rendered by CouchDB). Removing the per-request limitation by commenting out the line "Couch.module_cache = {};" from State.reset you gain another 10x speed increase on-top of the previous for all subsequent requests. I think the small risks of users abusing the module_cache for storing state between requests (which you might argue is at their own risk) are by far outweighed by the performance increases for serious couchapps.
        Hide
        Alexander Shorin added a comment -

        A little complex solution, isn't it? You have access to ddoc via closure within compile function - why not to store compiled modules cache within it as ddoc functions does? So there wouldn't care about cache false hit and it would be always up to date if design document changes.

        Show
        Alexander Shorin added a comment - A little complex solution, isn't it? You have access to ddoc via closure within compile function - why not to store compiled modules cache within it as ddoc functions does? So there wouldn't care about cache false hit and it would be always up to date if design document changes.
        Hide
        Caolan McMahon added a comment -

        If I understand you correctly, you're suggesting compiled modules are stored as a property on the string containing their source code in the design document? That would work fine of course, but I thought having a separate module_cache made it easier to clear between requests. Now I think we should persist the cache between requests and if that's the case, storing it on the design doc probably makes more sense

        If we accept a more persistent cache, then you're right to point out that we need to make sure it's cleared when the design doc changes. This wasn't a consideration when the cache was cleared between requests, so I didn't add a test for it. I'll update the patch to include one!

        Thanks for the feedback

        Show
        Caolan McMahon added a comment - If I understand you correctly, you're suggesting compiled modules are stored as a property on the string containing their source code in the design document? That would work fine of course, but I thought having a separate module_cache made it easier to clear between requests. Now I think we should persist the cache between requests and if that's the case, storing it on the design doc probably makes more sense If we accept a more persistent cache, then you're right to point out that we need to make sure it's cleared when the design doc changes. This wasn't a consideration when the cache was cleared between requests, so I didn't add a test for it. I'll update the patch to include one! Thanks for the feedback
        Hide
        Alexander Shorin added a comment -

        Yes, your suggestions are right(: All design documents at first have been stored within local query server cache and only after that they could be used for any operations. If ddoc changes it will be overwritted in cache with all compiled functions that was stored.

        Same thing I've implemented for python query server and it works perfect.

        Show
        Alexander Shorin added a comment - Yes, your suggestions are right(: All design documents at first have been stored within local query server cache and only after that they could be used for any operations. If ddoc changes it will be overwritted in cache with all compiled functions that was stored. Same thing I've implemented for python query server and it works perfect.
        Hide
        Caolan McMahon added a comment -

        Now with tests for cache invalidation after a design doc update

        Show
        Caolan McMahon added a comment - Now with tests for cache invalidation after a design doc update
        Hide
        Caolan McMahon added a comment -

        Added tests and fixes for module resolution and module.id values

        Show
        Caolan McMahon added a comment - Added tests and fixes for module resolution and module.id values
        Hide
        Alexander Shorin added a comment -

        Two questions:
        1. Why new object created for cache proposes instead of storing compiled modules with replacing of original code? That could be required to create anonymous function before to not cache module, exports and require arguments, but anyway that could be more cheaper solution.
        2. I see that all tests have finished with 200 OK status with no errors, but circular dependencies are signs of architecture flaws, aren't they? Wouldn't it better to explicitly alert developer to fix code before it haven't create require's hell?

        No offence, I just interested in behavior synchronization(:

        Show
        Alexander Shorin added a comment - Two questions: 1. Why new object created for cache proposes instead of storing compiled modules with replacing of original code? That could be required to create anonymous function before to not cache module, exports and require arguments, but anyway that could be more cheaper solution. 2. I see that all tests have finished with 200 OK status with no errors, but circular dependencies are signs of architecture flaws, aren't they? Wouldn't it better to explicitly alert developer to fix code before it haven't create require's hell? No offence, I just interested in behavior synchronization(:
        Hide
        Caolan McMahon added a comment -

        1. I didn't want to replace the source of the module with a function and loose that information, so I tried adding a compiled property to the string (the same approach used in mikeal's patch: https://issues.apache.org/jira/browse/COUCHDB-890) but because strings are not passed around by reference, you can't extend the string object on the design doc with a new property easily... in fact I don't think that patch even works. Adding a _module_cache property seemed like the simplest way to store compiled modules without restructuring lots of code without much test coverage (and it lead me to fixing errors in the ways module ids are defined and module resolution works).

        2. Circular dependencies are handled like this in every other environment I can think of. In fact node.js does exactly the same... but you'll see similar behaviour from Python and others. I think developers will expect CouchDB to behave in the same way.

        Thanks again for the feedback Alexander

        Show
        Caolan McMahon added a comment - 1. I didn't want to replace the source of the module with a function and loose that information, so I tried adding a compiled property to the string (the same approach used in mikeal's patch: https://issues.apache.org/jira/browse/COUCHDB-890 ) but because strings are not passed around by reference, you can't extend the string object on the design doc with a new property easily... in fact I don't think that patch even works. Adding a _module_cache property seemed like the simplest way to store compiled modules without restructuring lots of code without much test coverage (and it lead me to fixing errors in the ways module ids are defined and module resolution works). 2. Circular dependencies are handled like this in every other environment I can think of. In fact node.js does exactly the same... but you'll see similar behaviour from Python and others. I think developers will expect CouchDB to behave in the same way. Thanks again for the feedback Alexander
        Hide
        Alexander Shorin added a comment -

        1. The question is actual because design doc function are cached in same way, so I just couldn't see any differences for modules. In fact of caching and dynamic language nature you could easily modify module nor any other function in same way and original code would be never executed once again to clean up this modifications. But this is on developers own conscience to use this feature or not(:

        2. Actually for Python, circular imports at top module would failed because requested objects wouldn't be defined yet. Placing imports down below of module could solves this, but discouraged by PEP-8. However, you are right, because there is specification of this function: http://wiki.commonjs.org/wiki/Modules/1.1 and p.3 is the answer(: But I suppose that current realization of require function doesn't accords to this specs in questions of top level modules and relative one, because currently all imports are relative no matter how their have defined.

        Show
        Alexander Shorin added a comment - 1. The question is actual because design doc function are cached in same way, so I just couldn't see any differences for modules. In fact of caching and dynamic language nature you could easily modify module nor any other function in same way and original code would be never executed once again to clean up this modifications. But this is on developers own conscience to use this feature or not(: 2. Actually for Python, circular imports at top module would failed because requested objects wouldn't be defined yet. Placing imports down below of module could solves this, but discouraged by PEP-8. However, you are right, because there is specification of this function: http://wiki.commonjs.org/wiki/Modules/1.1 and p.3 is the answer(: But I suppose that current realization of require function doesn't accords to this specs in questions of top level modules and relative one, because currently all imports are relative no matter how their have defined.
        Hide
        Alexander Shorin added a comment -

        I've created separated issue[1] about require function and ids. Could you try those solution for your's one?

        [1] https://issues.apache.org/jira/browse/COUCHDB-1151

        Show
        Alexander Shorin added a comment - I've created separated issue [1] about require function and ids. Could you try those solution for your's one? [1] https://issues.apache.org/jira/browse/COUCHDB-1151
        Hide
        Caolan McMahon added a comment -

        I don't see the point in storing a function on the design doc if the results of the function will never change. We may as well store the results of the function. The only issue is detecting that a require path has found a module and not an object. Using a function allows us to detect this because a design doc is JSON and won't contain them, but the results of a module can be anything, so we wouldn't know if the require path had worked or if we'd found a pre-compiled module. Having a separate store for compiled modules avoids this issue without having to constantly call a function which will always give the same result.

        Also, by caching the result and not the function you can be sure that instanceof checks will work correctly. Consider the following, where 'a' is the cached function for a module (instead of a module result):

        > a = function () {return {b: function ()

        { this.name = 'bee'; }

        }}
        [Function]
        > one = a();

        { b: [Function] }

        > two = a();

        { b: [Function] }

        > obj1 = new one.b()

        { name: 'bee' }

        > obj2 = new two.b()

        { name: 'bee' }

        > obj1 instanceof one.b
        true
        > obj1 instanceof two.b
        false

        Show
        Caolan McMahon added a comment - I don't see the point in storing a function on the design doc if the results of the function will never change. We may as well store the results of the function. The only issue is detecting that a require path has found a module and not an object. Using a function allows us to detect this because a design doc is JSON and won't contain them, but the results of a module can be anything, so we wouldn't know if the require path had worked or if we'd found a pre-compiled module. Having a separate store for compiled modules avoids this issue without having to constantly call a function which will always give the same result. Also, by caching the result and not the function you can be sure that instanceof checks will work correctly. Consider the following, where 'a' is the cached function for a module (instead of a module result): > a = function () {return {b: function () { this.name = 'bee'; } }} [Function] > one = a(); { b: [Function] } > two = a(); { b: [Function] } > obj1 = new one.b() { name: 'bee' } > obj2 = new two.b() { name: 'bee' } > obj1 instanceof one.b true > obj1 instanceof two.b false
        Hide
        mikeal added a comment -

        i just took a glance at it. looks good. but patches in this code tend to cause weird bugs we don't find until people actually use it, the face that there are tests added makes me feel good about it.

        i say merge it.

        Show
        mikeal added a comment - i just took a glance at it. looks good. but patches in this code tend to cause weird bugs we don't find until people actually use it, the face that there are tests added makes me feel good about it. i say merge it.
        Hide
        Paul Joseph Davis added a comment -

        @mikeal

        Sounds good, thanks for the check. I'll commit this tonight.

        Show
        Paul Joseph Davis added a comment - @mikeal Sounds good, thanks for the check. I'll commit this tonight.
        Hide
        Paul Joseph Davis added a comment -

        Committed this to trunk. Went to backport it to 1.1.x and it had a merge conflict in the module code. If someone wants to figure out why be my guest, otherwise it's not gonna make 1.1

        Show
        Paul Joseph Davis added a comment - Committed this to trunk. Went to backport it to 1.1.x and it had a merge conflict in the module code. If someone wants to figure out why be my guest, otherwise it's not gonna make 1.1
        Hide
        Paul Joseph Davis added a comment -

        Applied in 1102006

        Show
        Paul Joseph Davis added a comment - Applied in 1102006
        Hide
        Paul Joseph Davis added a comment -

        Nevermind. Soon as I saw the diff colorized it was an apparent conflict. Applied to 1.1.x as well.

        Show
        Paul Joseph Davis added a comment - Nevermind. Soon as I saw the diff colorized it was an apparent conflict. Applied to 1.1.x as well.
        Hide
        Max Desyatov added a comment -

        Circular require still doesn't work in 1.2.1, at least in validation functions. Try this one

        {
        "_id": "_design/validation",
        "validate_doc_update": "function (newDoc, oldDoc, userCtx, secObj) {\n if (!(newDoc._deleted != null)) {\n throw

        {\n forbidden: require('lib/test').meow\n }

        ;\n }\n }",
        "lib":

        { "test": "exports.blah = 1234; var two = require('./two'); exports.meow = two.meh + 1;", "two": "var test = require('./test'); exports.meh = test.blah + 1;" }

        }

        Expected result is 1236.
        In CouchDB 1.2.1 it always throws null, though according to CommonJS Modules spec (even 1.0): "If there is a dependency cycle, the foreign module may not have finished executing at the time it is required by one of its transitive dependencies; in this case, the object returned by "require" must contain at least the exports that the foreign module has prepared before the call to require that led to the current module's execution."

        Show
        Max Desyatov added a comment - Circular require still doesn't work in 1.2.1, at least in validation functions. Try this one { "_id": "_design/validation", "validate_doc_update": "function (newDoc, oldDoc, userCtx, secObj) {\n if (!(newDoc._deleted != null)) {\n throw {\n forbidden: require('lib/test').meow\n } ;\n }\n }", "lib": { "test": "exports.blah = 1234; var two = require('./two'); exports.meow = two.meh + 1;", "two": "var test = require('./test'); exports.meh = test.blah + 1;" } } Expected result is 1236. In CouchDB 1.2.1 it always throws null, though according to CommonJS Modules spec (even 1.0): "If there is a dependency cycle, the foreign module may not have finished executing at the time it is required by one of its transitive dependencies; in this case, the object returned by "require" must contain at least the exports that the foreign module has prepared before the call to require that led to the current module's execution."

          People

          • Assignee:
            Unassigned
            Reporter:
            Caolan McMahon
          • Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development