Details

    • Type: Sub-task Sub-task
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.1
    • Fix Version/s: None
    • Component/s: JavaScript View Server
    • Labels:
      None
    • Environment:

      All

    • Skill Level:
      New Contributors Level (Easy)

      Description

      New SpiderMonkey releases do not eval() a sole anonymous function expression. That is not a valid JavaScript statement, and so it is not a valid JavaScript script.

      COUCHDB-1302 addressed this for 1.1 and the 1.1.x branch. This ticket is for 1.2. (Sorry to spam COUCHDB-1302. I saw "Unassigned" and read "Unresolved.")

        Activity

        Hide
        Jason Smith added a comment -
          1. Facts

        1. All documentation, all books, all online resources, and almost all code use the "function expression" style to program CouchDB:
        { "views":{"people":{"map":
        "function(doc)

        { emit(doc.name, 1) }

        "
        }}}

        2. The code in the string is an invalid JavaScript statement, and also an invalid program.

        3. (Computer scientists, divert your gaze!) The code is a valid expression, or perhaps an r-value. You can copy and paste it to the right-hand side of an assignment statement, or paste it as a parameter in a function call.

        4. CouchDB inadvertently exploited a now-fixed SM bug where you can just eval() the expression, it is treated as a statement, and eval returns the compiled function object.

        5. Distinct from Fact 4, CouchDB users inadvertently exploited the bug by writing what are more properly called "programs", with helper functions, and a final (incorrect JavaScript) function expression serving as the entry point for Couch to call into.

        6. There may be, if the community wants, a distinction between valid JavaScript programming and valid CouchDB programming.

          1. Opinions

        Those clever anonymous function expressions are relaxing. They make intuitive sense. Learning how map/reduce works is one of the major emotional payoffs CouchDB provides to a learner.

        Asking the wiki maintainers, the book authors, and the CouchDB users to change anything, even one character, is not relaxing, major release or no.

        Instead we should ask something else:

        1. CouchDB should support the same old same-old. Even on newer SpiderMonkey.
        2. People who wrote full-on programs (helper functions, global state, etc.) must adhere to the longstanding documentation and change their code. Perhaps CouchDB or its community might show how.

        Finally, unless someone can show otherwise, we should assume that very few people wrote programs instead of expressions and so CouchDB can justifiably start throwing errors on their code in the 1.2.x branches.

        Show
        Jason Smith added a comment - Facts 1. All documentation, all books, all online resources, and almost all code use the "function expression" style to program CouchDB: { "views":{"people":{"map": "function(doc) { emit(doc.name, 1) } " }}} 2. The code in the string is an invalid JavaScript statement, and also an invalid program. 3. (Computer scientists, divert your gaze!) The code is a valid expression, or perhaps an r-value. You can copy and paste it to the right-hand side of an assignment statement, or paste it as a parameter in a function call. 4. CouchDB inadvertently exploited a now-fixed SM bug where you can just eval() the expression, it is treated as a statement, and eval returns the compiled function object. 5. Distinct from Fact 4, CouchDB users inadvertently exploited the bug by writing what are more properly called "programs", with helper functions, and a final (incorrect JavaScript) function expression serving as the entry point for Couch to call into. 6. There may be, if the community wants, a distinction between valid JavaScript programming and valid CouchDB programming. Opinions Those clever anonymous function expressions are relaxing. They make intuitive sense. Learning how map/reduce works is one of the major emotional payoffs CouchDB provides to a learner. Asking the wiki maintainers, the book authors, and the CouchDB users to change anything, even one character, is not relaxing, major release or no. Instead we should ask something else: 1. CouchDB should support the same old same-old. Even on newer SpiderMonkey. 2. People who wrote full-on programs (helper functions, global state, etc.) must adhere to the longstanding documentation and change their code. Perhaps CouchDB or its community might show how. Finally, unless someone can show otherwise, we should assume that very few people wrote programs instead of expressions and so CouchDB can justifiably start throwing errors on their code in the 1.2.x branches.
        Hide
        Paul Joseph Davis added a comment -

        Describing this as exploiting a bug is incorrect. SpiderMonkey had a non-standard extension to JavaScript that allowed for this syntax to be considered valid. It was enabled by default.

        Recent SpiderMonkey trunk has removed this option so it is no longer an error. The logic underneath is interesting in that it has to specifically check for this condition (ie, its completely valid except for a single "is anonymous function as statement" check.

        The nicest (and probably least likely) solution would be for SpiderMonkey to add this option back but just disable it by default. I asked the SM team in IRC shortly after we found the issue and they expressed a very negative view of this approach so I doubt it's a possibility.

        For reference, Jason's example:

        { "views":{"people":{"map":
        "function(doc)

        { emit(doc.name, 1) }"
        }}}

        And a fixed example that could work (with modifications to couchjs):

        { "views":{"people":{"map":
        "function map(doc) { emit(doc.name, 1) }

        "
        }}}

        and alternatively:

        { "views":{"people":{"map":
        "var map = function(doc)

        { emit(doc.name, 1) }

        "
        }}}

        We can upgrade couchjs to accept either form quite easily. I think a reasonable plan would be to have a version of couchjs that accepts both and logs a deprecation warning about the old version. Then we just need to go on a PR campaign to get everyone to make the upgrade so they're prepared for when their code eventually needs to run on a newer SpiderMonkey.

        Also, I'm not convinced that this needs to be in 1.2. We still have a bit of time to prepare a 1.2.1 that includes this. 1.2 should've been out a long time ago and I'd rather we focus on getting that out and then worrying about this afterwards since we have a plausible backwards compatible upgrade path.

        Show
        Paul Joseph Davis added a comment - Describing this as exploiting a bug is incorrect. SpiderMonkey had a non-standard extension to JavaScript that allowed for this syntax to be considered valid. It was enabled by default. Recent SpiderMonkey trunk has removed this option so it is no longer an error. The logic underneath is interesting in that it has to specifically check for this condition (ie, its completely valid except for a single "is anonymous function as statement" check. The nicest (and probably least likely) solution would be for SpiderMonkey to add this option back but just disable it by default. I asked the SM team in IRC shortly after we found the issue and they expressed a very negative view of this approach so I doubt it's a possibility. For reference, Jason's example: { "views":{"people":{"map": "function(doc) { emit(doc.name, 1) }" }}} And a fixed example that could work (with modifications to couchjs): { "views":{"people":{"map": "function map(doc) { emit(doc.name, 1) } " }}} and alternatively: { "views":{"people":{"map": "var map = function(doc) { emit(doc.name, 1) } " }}} We can upgrade couchjs to accept either form quite easily. I think a reasonable plan would be to have a version of couchjs that accepts both and logs a deprecation warning about the old version. Then we just need to go on a PR campaign to get everyone to make the upgrade so they're prepared for when their code eventually needs to run on a newer SpiderMonkey. Also, I'm not convinced that this needs to be in 1.2. We still have a bit of time to prepare a 1.2.1 that includes this. 1.2 should've been out a long time ago and I'd rather we focus on getting that out and then worrying about this afterwards since we have a plausible backwards compatible upgrade path.
        Hide
        Jason Smith added a comment -

        Paul, tl;dr: I see your point but disagree because I love the old syntax. I also have a fixed JS server (i.e. the code couchjs runs) working in all (IMO) important cases. Would you care to at least see the patch?

        Next I continue my plea to retain CouchDB's aesthetic decency...

        After years of a relaxing API, why would CouchDB force us to stop writing perfectly valid functions and instead write valid programs? It is redundant. It is confusing. There is no obvious benefit to the user. It is Kafkaesque.

        The current syntax that better matches our mental model is this:

        { "views": {
        "example": {
        "map": "function(doc)

        { emit(1, 2) }"
        ...

        That is intuitively relaxing because it feels like Javascript:

        doc =
        {"views": {
        "example": {
        "map": function(doc) { emit(1, 2) }

        ...

        Please re-read your own examples. How many new ways does it confuse the user? And for what? Because we can't' be bothered to do anything except the simplest possible implementation: just run it through eval(). What about this view object:

        "Map": "function map(D)

        { emit(1,D) }

        // wrong JSON key"
        "map": "function mapo(D)

        { emit(1, D) }

        // fat-fingered the function name"

        Really, if CouchDB will insist that users begin writing programs and not just expressions, it makes more sense to eradicate the "map" and "reduce" fields and replace the whole object with a string.

        {"views": {
        "people": "
        function map(D)

        { emit(D.name, 1) }

        function reduce(K, V, re)

        { return sum(K) }

        "}
        }

        OTOH, given a previously well-formed view definition and newer SpiderMonkey, couchjs should DTRT, do what it takes to make it work, as the user has already begun relaxing.

        Show
        Jason Smith added a comment - Paul, tl;dr: I see your point but disagree because I love the old syntax. I also have a fixed JS server (i.e. the code couchjs runs) working in all (IMO) important cases. Would you care to at least see the patch? Next I continue my plea to retain CouchDB's aesthetic decency... After years of a relaxing API, why would CouchDB force us to stop writing perfectly valid functions and instead write valid programs? It is redundant. It is confusing. There is no obvious benefit to the user. It is Kafkaesque. The current syntax that better matches our mental model is this: { "views": { "example": { "map": "function(doc) { emit(1, 2) }" ... That is intuitively relaxing because it feels like Javascript: doc = {"views": { "example": { "map": function(doc) { emit(1, 2) } ... Please re-read your own examples. How many new ways does it confuse the user? And for what? Because we can't' be bothered to do anything except the simplest possible implementation: just run it through eval(). What about this view object: "Map": "function map(D) { emit(1,D) } // wrong JSON key" "map": "function mapo(D) { emit(1, D) } // fat-fingered the function name" Really, if CouchDB will insist that users begin writing programs and not just expressions, it makes more sense to eradicate the "map" and "reduce" fields and replace the whole object with a string. {"views": { "people": " function map(D) { emit(D.name, 1) } function reduce(K, V, re) { return sum(K) } "} } OTOH, given a previously well-formed view definition and newer SpiderMonkey, couchjs should DTRT, do what it takes to make it work, as the user has already begun relaxing.
        Hide
        Jason Smith added a comment -

        Paul's right, this is not a 1.2 issue, or at least not 1.2.0.

        Show
        Jason Smith added a comment - Paul's right, this is not a 1.2 issue, or at least not 1.2.0.
        Hide
        max ogden added a comment -

        another option is to require commonjs instead of eval() and then just prepend "module.exports =" to the existing syntax. then you're using the same module syntax as the rest of the server side javascript world

        Show
        max ogden added a comment - another option is to require commonjs instead of eval() and then just prepend "module.exports =" to the existing syntax. then you're using the same module syntax as the rest of the server side javascript world
        Hide
        Jason Smith added a comment -

        Love it. I'd still rather that Couch supports the same old syntax as always, but if we must start writing full-on scripts, CommonJS is the wisest direction.

        Show
        Jason Smith added a comment - Love it. I'd still rather that Couch supports the same old syntax as always, but if we must start writing full-on scripts, CommonJS is the wisest direction.
        Hide
        James Howe added a comment -

        For Paul's example alternatives, is it not a trivial operation to convert the old format to these internally before eval? Then users can do what they've always done (and don't have to worry about new ways to ty[po) and the SM-related problem is solved.

        Show
        James Howe added a comment - For Paul's example alternatives, is it not a trivial operation to convert the old format to these internally before eval? Then users can do what they've always done (and don't have to worry about new ways to ty[po) and the SM-related problem is solved.
        Hide
        Paul Joseph Davis added a comment -

        @Jason

        Of course I'd look at a patch if you have one. Though as I've mentioned before, anything that relies on parsing JavaScript or other source-to-source transforms is going to be heavily discouraged. Also, while I understand that this will change things from the old method, I'm not overly swayed by the relaxing arguments here. Fact of the matter is that this is invalid JavaScript, so regardless of what we do, it should at least be valid JavaScript. (And jslint and other command line tools would finally start working as well).

        @Jason, @Max, @James

        I tried doing some hacky shortcuts to account for this. Specifically, "(" + map_source + ")" because the "error" is literally "anonymous function at top level scope" or some such and the parens make it kosher. But this ends up breaking all of the code that looks like "var f = 1; function(doc)

        {emit(doc._id, f);}

        ;" which is a noticeable amount of JavaScript.

        @Jason, @Max, @James

        I did just think of an alternative solution which would be to make the new version use scripts instead of functions. The resulting syntax would be something like:

        "var doc = next_doc(); emit(doc._id, null)"

        Which would then just be executed once for every doc. Though going this route, we'd need some sort of context for the functions so people could precompile their CommonJS libs and the like.

        Show
        Paul Joseph Davis added a comment - @Jason Of course I'd look at a patch if you have one. Though as I've mentioned before, anything that relies on parsing JavaScript or other source-to-source transforms is going to be heavily discouraged. Also, while I understand that this will change things from the old method, I'm not overly swayed by the relaxing arguments here. Fact of the matter is that this is invalid JavaScript, so regardless of what we do, it should at least be valid JavaScript. (And jslint and other command line tools would finally start working as well). @Jason, @Max, @James I tried doing some hacky shortcuts to account for this. Specifically, "(" + map_source + ")" because the "error" is literally "anonymous function at top level scope" or some such and the parens make it kosher. But this ends up breaking all of the code that looks like "var f = 1; function(doc) {emit(doc._id, f);} ;" which is a noticeable amount of JavaScript. @Jason, @Max, @James I did just think of an alternative solution which would be to make the new version use scripts instead of functions. The resulting syntax would be something like: "var doc = next_doc(); emit(doc._id, null)" Which would then just be executed once for every doc. Though going this route, we'd need some sort of context for the functions so people could precompile their CommonJS libs and the like.
        Hide
        Jason Smith added a comment -

        @Paul

        We really need to define "invalid JavaScript" and distinguish it from "invalid CouchDB programming" which is why I wrote Fact #6.

        For example, Benoit suggested that we could provide only the function bodies. They are implicitly wrapped in a function(doc)

        { ... }

        . That's neat! In a world where CouchDB had always worked that way, we would all call "emit(doc._id, null)" valid CouchDB but we know it is invalid JavaScript (doc is unbound).

        I take you to mean that "valid CouchDB" implies "valid JavaScript."

        But valid CouchDB is more relaxing with anonymous functions ("invalid JavaScript"). Business as usual is beautiful.

        How could it be implemented on all SM versions? What if we catch the eval()? If the error is "anonymous function at a top level scope" then we try again with "("code")". Yes, it is a tiny source transformation, and that is bad; however it is a common idiom in Javascript when building code from data.

        (If you ask me, it is the helper function people who should see their code broken; not we who have followed the customs and conventions all along. Thus I prefer your original patch the most: wrap it in parens.)

        Show
        Jason Smith added a comment - @Paul We really need to define "invalid JavaScript" and distinguish it from "invalid CouchDB programming" which is why I wrote Fact #6. For example, Benoit suggested that we could provide only the function bodies . They are implicitly wrapped in a function(doc) { ... } . That's neat! In a world where CouchDB had always worked that way, we would all call "emit(doc._id, null)" valid CouchDB but we know it is invalid JavaScript (doc is unbound). I take you to mean that "valid CouchDB" implies "valid JavaScript." But valid CouchDB is more relaxing with anonymous functions ("invalid JavaScript"). Business as usual is beautiful. How could it be implemented on all SM versions? What if we catch the eval()? If the error is "anonymous function at a top level scope" then we try again with "(" code ")". Yes, it is a tiny source transformation, and that is bad; however it is a common idiom in Javascript when building code from data. (If you ask me, it is the helper function people who should see their code broken; not we who have followed the customs and conventions all along. Thus I prefer your original patch the most: wrap it in parens.)
        Hide
        Paul Joseph Davis added a comment -

        @Jason

        I reject the premise that there's a distinction to be made between valid JavaScript and valid "CouchDB programming". At this point I would say that if its not possible to have whatever we decide pass jslint in a copy/paste manner, then its not valid anything.

        Also, the function body thing is the same thing I suggested as an aside except that the wrapping it in a function part is useless and will lead to more issues than it solves. Also, it has the same issues that if you did do it, the included code would be executed everytime (ie, if you pulled in a large helper it would redefine everything again (also note, this is not the same as compilation)).

        Show
        Paul Joseph Davis added a comment - @Jason I reject the premise that there's a distinction to be made between valid JavaScript and valid "CouchDB programming". At this point I would say that if its not possible to have whatever we decide pass jslint in a copy/paste manner, then its not valid anything. Also, the function body thing is the same thing I suggested as an aside except that the wrapping it in a function part is useless and will lead to more issues than it solves. Also, it has the same issues that if you did do it, the included code would be executed everytime (ie, if you pulled in a large helper it would redefine everything again (also note, this is not the same as compilation)).
        Hide
        James Howe added a comment - - edited

        There's nothing "Invalid" about this javascript: {"map": "function(doc) {}"}, it's only the way it's used (i.e. blindly eval'd) that is invalid. A source-to-source translation cold be s/\bfunction(/function map(/.

        IMO, it is the people writing {"map": "var d=1; function(doc)

        { emit(doc._id, d);"}

        that should find their code not working, as they are using undocumented features.

        If changes are to be made, I much prefer Benoit's suggestion - having "doc" be implicit in the context and the value of "map" keys being function bodies. "emit" already works like this, so makes no difference to JSLintability.

        Would it be possible to implement something similar to NodeJS's vm module, to allow for more flexible and safe run-time-compilation than eval?

        Show
        James Howe added a comment - - edited There's nothing "Invalid" about this javascript: {"map": "function(doc) {}"}, it's only the way it's used (i.e. blindly eval'd) that is invalid. A source-to-source translation cold be s/\bfunction(/function map(/. IMO, it is the people writing {"map": "var d=1; function(doc) { emit(doc._id, d);"} that should find their code not working, as they are using undocumented features. If changes are to be made, I much prefer Benoit's suggestion - having "doc" be implicit in the context and the value of "map" keys being function bodies. "emit" already works like this, so makes no difference to JSLintability. Would it be possible to implement something similar to NodeJS's vm module, to allow for more flexible and safe run-time-compilation than eval?
        Hide
        Paul Joseph Davis added a comment -

        Erm, I'd like to point out that the string "function(doc) {}" actually is invalid JavaScript. Anonymous functions at global scope are specifically not ok. Although anonymous functions in a statement context are ok. Its a subtle distinction but its quite important.

        Also, just today I came across a scenario of compiling a regular expression outside the function scope as a valid use of the "helper" pattern.

        Show
        Paul Joseph Davis added a comment - Erm, I'd like to point out that the string "function(doc) {}" actually is invalid JavaScript. Anonymous functions at global scope are specifically not ok. Although anonymous functions in a statement context are ok. Its a subtle distinction but its quite important. Also, just today I came across a scenario of compiling a regular expression outside the function scope as a valid use of the "helper" pattern.
        Hide
        Jason Smith added a comment -

        And since when has the contents of "map" become a global scope? Who made that decision? Was it you?

        Really, if eval() is the only arrow in your quiver then it's no wonder you see the problem this way.

        And how have you decided so-called "source-to-source translations" are out of the question? They are the backbone of JavasScript encapsulation, undergirding CommonJS (CouchDB and Node), RequireJS, Ender.js, and no doubt more.

        It's just a couple of parens on either side. Look into your heart. You know it's the right call.

        Show
        Jason Smith added a comment - And since when has the contents of "map" become a global scope? Who made that decision? Was it you? Really, if eval() is the only arrow in your quiver then it's no wonder you see the problem this way. And how have you decided so-called "source-to-source translations" are out of the question? They are the backbone of JavasScript encapsulation, undergirding CommonJS (CouchDB and Node), RequireJS, Ender.js, and no doubt more. It's just a couple of parens on either side. Look into your heart. You know it's the right call.
        Hide
        Paul Joseph Davis added a comment -

        Firstly, I'm not thinking of eval, I'm thinking of JS_CompileScript/JS_CompileUCScript which is where this restriction is actually applied. Secondly, I didn't say transforms are out of the question, I said they are heavily discouraged. The reason for discouraging them is because they break shit as demonstrated by all the people that wrote in to tell me I broke shit with the simple paren wrapping approach not to mention that this is the sort of engineering that leads to SQL injection.

        Just because it's the easy answer for some folks doesn't necessarily mean its the right call.

        Show
        Paul Joseph Davis added a comment - Firstly, I'm not thinking of eval, I'm thinking of JS_CompileScript/JS_CompileUCScript which is where this restriction is actually applied. Secondly, I didn't say transforms are out of the question, I said they are heavily discouraged. The reason for discouraging them is because they break shit as demonstrated by all the people that wrote in to tell me I broke shit with the simple paren wrapping approach not to mention that this is the sort of engineering that leads to SQL injection. Just because it's the easy answer for some folks doesn't necessarily mean its the right call.
        Hide
        James Howe added a comment -

        The string "function(doc) {}" isnt invalid JavaScript - it's just a string. It only becomes an anonymous function at global scope if you pass it directly to eval().

        Show
        James Howe added a comment - The string "function(doc) {}" isnt invalid JavaScript - it's just a string. It only becomes an anonymous function at global scope if you pass it directly to eval().
        Hide
        Benoit Chesneau added a comment - - edited

        Having a json which would do { "views": "name": { "map": "function map(doc) {}"}} is somehow inelegant and redundant imo.

        The other solution { "views": "name": "function map(doc) {} .. function reduce(doc) {}" } may be confusing:

        • It can introduce the idea that people could share vars between this functions, while they are used at different states. For example what would happen if someone share the results of the map function to the reduce function.
        • Since these functions are used at different steps, why sharing them in the same member ?

        I think it may be better to keep the current split in different properties of the design doc. I like the idea of elasticsearch to just pass script with a global context inside.

        we could have something like:

        map = "some script" which could set a Ctx object containing all the needed info:

        ctx.doc
        ctx.ddoc ...

        then this script could just return or emit.

        What do you think?

        Show
        Benoit Chesneau added a comment - - edited Having a json which would do { "views": "name": { "map": "function map(doc) {}"}} is somehow inelegant and redundant imo. The other solution { "views": "name": "function map(doc) {} .. function reduce(doc) {}" } may be confusing: It can introduce the idea that people could share vars between this functions, while they are used at different states. For example what would happen if someone share the results of the map function to the reduce function. Since these functions are used at different steps, why sharing them in the same member ? I think it may be better to keep the current split in different properties of the design doc. I like the idea of elasticsearch to just pass script with a global context inside. we could have something like: map = "some script" which could set a Ctx object containing all the needed info: ctx.doc ctx.ddoc ... then this script could just return or emit. What do you think?
        Hide
        Paul Joseph Davis added a comment -

        @Benoit

        That's what I was trying to describe earlier. The draw backs that come to mind is that it's quite a break with the current behavior (and harder to detect the difference) and we'd probably want to add some sort of "context" member that is compiled and executed once when the map or reduce is run (only once for maps, and once per reduce call). Or we can just force people to abuse the commonjs cache maybe?

        Show
        Paul Joseph Davis added a comment - @Benoit That's what I was trying to describe earlier. The draw backs that come to mind is that it's quite a break with the current behavior (and harder to detect the difference) and we'd probably want to add some sort of "context" member that is compiled and executed once when the map or reduce is run (only once for maps, and once per reduce call). Or we can just force people to abuse the commonjs cache maybe?
        Hide
        Jason Smith added a comment -

        @Paul, good points.

        So, we disagree on where to make the trade-off. That's fine. I yield to you and the community if that is its consensus. (Is it?)

        CouchDB should support the ubiquitous, nice-looking, and relaxing "invalid" style even if that means "(" + ")". But it should not support helper function people. Obviously that is just IMO.

        But if CouchDB does change and we do start writing valid statements/programs in our ddocs, Max's suggestion of the standard-CommonJS-is clearly best.

        Show
        Jason Smith added a comment - @Paul, good points. So, we disagree on where to make the trade-off. That's fine. I yield to you and the community if that is its consensus. (Is it?) CouchDB should support the ubiquitous, nice-looking, and relaxing "invalid" style even if that means "(" + ")". But it should not support helper function people. Obviously that is just IMO. But if CouchDB does change and we do start writing valid statements/programs in our ddocs, Max's suggestion of the standard- CommonJS -is clearly best.
        Hide
        Marcello Nuccio added a comment -

        @Jason,

        I've found that, using valid Javascript in map/reduce, eases development for me, because I can use standard tools (uglifyjs, jshint, ...) without extra work.

        Moreover, I would also remove all global functions and inject them:

        function sampleMap(doc, view)

        { view.emit(doc._id, null); }

        because this makes unit testing much more easier than now.

        Ease of development and ease of testing it's much more relaxing for me, than the possibility to use anonymous functions.

        Show
        Marcello Nuccio added a comment - @Jason, I've found that, using valid Javascript in map/reduce, eases development for me, because I can use standard tools (uglifyjs, jshint, ...) without extra work. Moreover, I would also remove all global functions and inject them: function sampleMap(doc, view) { view.emit(doc._id, null); } because this makes unit testing much more easier than now. Ease of development and ease of testing it's much more relaxing for me, than the possibility to use anonymous functions.
        Hide
        Jason Smith added a comment -

        @Marcello, yes I am okay with the idea of JavaScript programs (I've realized the word "invalid" is not helpful in this discussion).

        However once again, CommonJS is the clear, obvious goal. It clarifies and unifies CouchDB, and CouchApps would immediately become Node.js apps (at least, you could require() their parts for unit testing which I have started to do as well.)

        I agree that emit(), log(), etc. should be global; but I disagree that they can be injected. There is a third way, the way CommonJS works. The code is implicitly wrapped in a larger function where `emit`, `log`, etc. had been passed as parameters.

        In CommonJS, your code is inside a function--not even conceptually, really. See share/server/util.js, around line 90.

        var s = "function (module, exports, require)

        { " + newModule.current + " }

        ";

        This is the sort of source code transforms I hope Paul would agree is fine. If newModule.current was wrong before (suppose it defined an anonymous function), then it is still wrong. And this pattern is well-known and well-exercised in several JavaScript environments. Perhaps one day there will be:

        var map_runner = "function (emit, log, sum)

        { " + map_value + "}

        "

        Show
        Jason Smith added a comment - @Marcello, yes I am okay with the idea of JavaScript programs (I've realized the word "invalid" is not helpful in this discussion). However once again, CommonJS is the clear, obvious goal. It clarifies and unifies CouchDB, and CouchApps would immediately become Node.js apps (at least, you could require() their parts for unit testing which I have started to do as well.) I agree that emit(), log(), etc. should be global; but I disagree that they can be injected. There is a third way, the way CommonJS works. The code is implicitly wrapped in a larger function where `emit`, `log`, etc. had been passed as parameters. In CommonJS, your code is inside a function--not even conceptually, really. See share/server/util.js, around line 90. var s = "function (module, exports, require) { " + newModule.current + " } "; This is the sort of source code transforms I hope Paul would agree is fine. If newModule.current was wrong before (suppose it defined an anonymous function), then it is still wrong. And this pattern is well-known and well-exercised in several JavaScript environments. Perhaps one day there will be: var map_runner = "function (emit, log, sum) { " + map_value + "} "
        Hide
        Marcello Nuccio added a comment -

        @Jason,

        CommonJS is fine, but it doesn't help for unit testing. CommonJS is a linker (runs on load time), dependency injection is done at run-time. One assembles code, the other object graph. Not the same thing.

        To ease unit testing, you need dependency injection. You can't easily test:

        function (doc) {
        var util = require('views/lib/util');
        if (util.isSomething(doc))

        { emit(doc._id, 1); }

        }

        It's doable (I'm doing it), but not easy. Testing the following is trivial:

        function (doc, view, require) {
        var util = require('views/lib/util');
        if (util.isSomething(doc))

        { view.emit(doc._id, 1); }

        }

        Globals, make the code harder to:

        • read (where do these values come from?);
        • test (how do I mock globals at run-time?);
        • write (you need to tell to your code-checker what globals you use.)
        Show
        Marcello Nuccio added a comment - @Jason, CommonJS is fine, but it doesn't help for unit testing. CommonJS is a linker (runs on load time), dependency injection is done at run-time. One assembles code, the other object graph. Not the same thing. To ease unit testing, you need dependency injection. You can't easily test: function (doc) { var util = require('views/lib/util'); if (util.isSomething(doc)) { emit(doc._id, 1); } } It's doable (I'm doing it), but not easy. Testing the following is trivial: function (doc, view, require) { var util = require('views/lib/util'); if (util.isSomething(doc)) { view.emit(doc._id, 1); } } Globals, make the code harder to: read (where do these values come from?); test (how do I mock globals at run-time?); write (you need to tell to your code-checker what globals you use.)
        Hide
        Alexander Shorin added a comment -

        [offtopic]
        Sorry for offtopic, but

        @Marcello,

        Why not to use black box testing instead of trying to mock every internal query server global function? Using QS directly as black box you will have warranty that your functions are tested well, because box is real, but you don't have to know how it works. In your way you testing for your own, may miss some details and output result could be different. This is not only about views that are not so complex as show or list functions. By the way, applying your idea to other functions I'll something weird for list function:

        function(head, req, provides, registerType, getRow, send, start, whatEverWillBeAddedInFuture)

        { ... }

        Looks not much relaxing(: Certainly, Javascript allows to skip some function arguments definition, but it still requires them in right order.

        And what about other query servers in this case? Currently, they are doing same work as Javascript one or just trying to do, but not all of them could skip arguments definition, has arguments vectors etc. This could create additional problems with migration from one QS to another.
        [/offtopic]

        About subject, I couldn't say anything important about function(...){} vs function name(...){} vs ... because I don't use Javascript query server, but in Python I have to write real functions with their names - it works as nice marker to find their crush in debug logs if it happens, so for me this is just nice feature(:

        P.S. Just a moment, what will happens with CoffeeScript in case of proposed changes?

        Show
        Alexander Shorin added a comment - [offtopic] Sorry for offtopic, but @Marcello, Why not to use black box testing instead of trying to mock every internal query server global function? Using QS directly as black box you will have warranty that your functions are tested well, because box is real, but you don't have to know how it works. In your way you testing for your own, may miss some details and output result could be different. This is not only about views that are not so complex as show or list functions. By the way, applying your idea to other functions I'll something weird for list function: function(head, req, provides, registerType, getRow, send, start, whatEverWillBeAddedInFuture) { ... } Looks not much relaxing(: Certainly, Javascript allows to skip some function arguments definition, but it still requires them in right order. And what about other query servers in this case? Currently, they are doing same work as Javascript one or just trying to do, but not all of them could skip arguments definition, has arguments vectors etc. This could create additional problems with migration from one QS to another. [/offtopic] About subject, I couldn't say anything important about function(...){} vs function name(...){} vs ... because I don't use Javascript query server, but in Python I have to write real functions with their names - it works as nice marker to find their crush in debug logs if it happens, so for me this is just nice feature(: P.S. Just a moment, what will happens with CoffeeScript in case of proposed changes?
        Hide
        Marcello Nuccio added a comment -

        @Alexander,
        this isn't as off-topic as it seems at first, because software design is heavily influenced by ease of testability.

        I've not proposed to inject every single function. I've proposed to inject the context. For a list function it may be:

        function aList(head, req, ctx) {
        var row;
        function renderRow(row)

        { //... }

        while((row = ctx.getRow()))

        { send(renderRow(row)); }

        }

        This greatly simplifies unit testing. And it does not imply that you mock "every internal query server global function". You can even do without mocking, but the point is that you are in control of it, so you can properly test whatever needs testing.

        Testing with the real query server, is done in end-to-end or integration testing. This is needed, but it does not eliminate the need for unit testing.

        Show
        Marcello Nuccio added a comment - @Alexander, this isn't as off-topic as it seems at first, because software design is heavily influenced by ease of testability. I've not proposed to inject every single function. I've proposed to inject the context. For a list function it may be: function aList(head, req, ctx) { var row; function renderRow(row) { //... } while((row = ctx.getRow())) { send(renderRow(row)); } } This greatly simplifies unit testing. And it does not imply that you mock "every internal query server global function". You can even do without mocking, but the point is that you are in control of it, so you can properly test whatever needs testing. Testing with the real query server, is done in end-to-end or integration testing. This is needed, but it does not eliminate the need for unit testing.
        Hide
        James Howe added a comment -

        @Marcello - for unit testing you can do what CommonsJS does (and Jason points to) in order to define the "globals".
        var map_runner = "function (emit, log, sum)

        { " + map_value + "}

        "

        Show
        James Howe added a comment - @Marcello - for unit testing you can do what CommonsJS does (and Jason points to) in order to define the "globals". var map_runner = "function (emit, log, sum) { " + map_value + "} "
        Hide
        Marcello Nuccio added a comment -

        @James, that's what I'm doing. I find that tests written this way are more complex with respect to tests I write for other things. But, I'm clearly missing something obvious here, since I'm alone on this.

        Show
        Marcello Nuccio added a comment - @James, that's what I'm doing. I find that tests written this way are more complex with respect to tests I write for other things. But, I'm clearly missing something obvious here, since I'm alone on this.
        Hide
        Jason Smith added a comment -

        @All: I am fascinated and excited about the discussion of unit testing Couch apps more easily. Would you mind moving this discussion to the dev@ and user@ list? (These JIRA tickets are sent there, so you can simply hit reply from your mailer.)

        @Dave, thanks again for your thoughts. In this thread, your arguments seem to have resonated more strongly than mine. I hope you'll chew over Max's point that, if our goal is more well-formed, valid Javascript as a means of exporting functionality to a larger framework, then CommonJS should be a strong favorite.

        Show
        Jason Smith added a comment - @All: I am fascinated and excited about the discussion of unit testing Couch apps more easily. Would you mind moving this discussion to the dev@ and user@ list? (These JIRA tickets are sent there, so you can simply hit reply from your mailer.) @Dave, thanks again for your thoughts. In this thread, your arguments seem to have resonated more strongly than mine. I hope you'll chew over Max's point that, if our goal is more well-formed, valid Javascript as a means of exporting functionality to a larger framework, then CommonJS should be a strong favorite.

          People

          • Assignee:
            Unassigned
            Reporter:
            Jason Smith
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:

              Development