CouchDB / COUCHDB-249

Treat output rows of views as documents for other views to build upon

    Details

    • Skill Level:
      Committers Level (Medium to Hard)

      Description

      Unless I manually copy the JSON rows of a view into a new document, I am unable to create new views that are computed from existing views. That is, it seems as if views are second class citizens compared to first-class documents.

      Suppose I wanted to find the spread between the cheapest suppliers and the most expensive suppliers of each fruit. I know it's possible to use one map/reduce to compute such a view, but I'd like to be able to re-use my existing "cheapest" and "costliest" views. That is, I'd like to use the document output of these views as input into another view.

      I started with the simple fruit store example in the CouchDB book. I developed a simple view called "cheapest" with the following map and reduce functions (the "costliest" view is the same as "cheapest", except that the reduce function's comparison is reversed):

      function(doc) {
        var store, price, key;
        if (doc.item && doc.prices) {
          for (store in doc.prices) {
            price = doc.prices[store];
            key = doc.item;
            emit(key, {store: store, price: price});
          }
        }
      }

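      function(item, store) {
        // "store" holds the emitted {store, price} values for the key
        var m = store[0], i; // declare i so it does not leak as a global
        for (i in store) {
          // keep whichever entry has the lower price
          if (m.price > store[i].price) m = store[i];
        }
        return m;
      }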

      The output is as follows:

      {"rows":[
      {"key":"apple","value":{"store":"Apples Express","price":0.79}},
      {"key":"banana","value":{"store":"Price Max","price":0.79}},
      {"key":"orange","value":{"store":"Citrus Circus","price":1.09}}
      ]}

      I'd like to develop a new view whose input is the output of the view above, but as far as I can tell, views only operate on documents, not the output of existing views. Am I missing something?
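For comparison, the single map/reduce alternative mentioned earlier might look roughly like the sketch below. It assumes the map function stays the same as in "cheapest" and only the reduce changes. The name spreadReduce and the shape of the returned object (cheapest, costliest, spread) are illustrative choices, not part of the original views, and the second price in the usage lines is made up.

```javascript
// Reduce sketch: tracks the cheapest and the costliest entry in one pass.
// values is [{store, price}, ...] on the first pass and
// [{cheapest, costliest, spread}, ...] when rereduce is true.
function spreadReduce(keys, values, rereduce) {
  var lo = null, hi = null, i, v, c, e;
  for (i = 0; i < values.length; i++) {
    v = values[i];
    c = rereduce ? v.cheapest : v;   // candidate for the cheapest entry
    e = rereduce ? v.costliest : v;  // candidate for the costliest entry
    if (lo === null || c.price < lo.price) lo = c;
    if (hi === null || e.price > hi.price) hi = e;
  }
  return { cheapest: lo, costliest: hi, spread: hi.price - lo.price };
}

// Usage outside CouchDB, with one price from the output above and one
// hypothetical price added purely for illustration.
var apple = spreadReduce(null, [
  { store: "Apples Express", price: 0.79 },
  { store: "Hypothetical Mart", price: 1.59 }
], false);
console.log(apple.cheapest.store, apple.costliest.store);
```

This works, but it duplicates the logic of the existing "cheapest" and "costliest" views instead of building on them, which is the limitation this ticket is about.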

      1. couch_view_updaer.erl.patch.txt
        7 kB
        Viacheslav Seledkin
      2. couch_view_updater.erl
        14 kB
        Viacheslav Seledkin

        Activity

        Viacheslav Seledkin added a comment - - edited

        I support this requirement. In my project I had to implement it myself in order to perform computation schemes that are not possible with the current view implementation. I am new to Erlang, and my implementation is probably neither efficient nor reliable. I would prefer to see this feature built into CouchDB.

        By the way, in case someone needs it, I am posting my code that does the job.

        Viacheslav Seledkin added a comment -

        This is a modified couch_view_updater.erl from the 739866 trunk version.
        To use it, add the following comment at the beginning of your view code:
        //put_results_to(db_name)
        function(doc) {
        if (doc.document){
        ..........

        The database db_name must already exist.
        After the view is rebuilt, all of its results are stored in the db_name database,
        so you can create additional views on it. An incremental update of the view also incrementally updates the contents of db_name to keep it consistent. Use at your own risk, and keep your data backups fresh.

        Viacheslav Seledkin added a comment - - edited

        Also attaching a patch file for couch_view_updater.erl to support multi-level views.

        Alexander Uvarov added a comment -

        Why not follow Erlang coding conventions? I am confused by getViewDBName and friends.

        Viacheslav Seledkin added a comment -

        I am not an Erlanger in any sense and I know nothing about Erlang coding culture; I wrote that code almost intuitively. It would be nice if someone revised the code and published it here.

        Mathijs Kwik added a comment -

        What's the status of this ticket?

        I would like this functionality very much. I tried the patch on 0.10, but it didn't apply.
        For now I need an external process that watches _changes and feeds the view output into new documents, but it would be much nicer if Couch could handle this by itself.

        Any progress?
        Any related tickets? (couldn't find any myself)
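The external-process workaround described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names rowsToDocs and bulkDocsBody, the _id scheme, and the source_view field are hypothetical. The only CouchDB behaviour relied on is that POST /{db}/_bulk_docs accepts a body of the form {"docs": [...]} and that updating an existing document requires its current _rev. Watching _changes and sending the HTTP requests are left out.

```javascript
// Turn the rows of a view response into documents for a second database.
// The _id scheme ("<viewName>:<JSON key>") is an assumption, chosen so that
// copying the same key again targets the same document.
function rowsToDocs(viewName, viewResponse) {
  return viewResponse.rows.map(function (row) {
    return {
      _id: viewName + ":" + JSON.stringify(row.key),
      source_view: viewName, // hypothetical bookkeeping field
      key: row.key,
      value: row.value
    };
  });
}

// Hypothetical helper: builds the JSON body for POST /{db}/_bulk_docs.
// knownRevs maps _id -> current _rev in the target database; documents that
// already exist there must carry their _rev to avoid a conflict.
function bulkDocsBody(docs, knownRevs) {
  return {
    docs: docs.map(function (doc) {
      var copy = Object.assign({}, doc); // do not mutate the input
      if (knownRevs[doc._id]) copy._rev = knownRevs[doc._id];
      return copy;
    })
  };
}

// Usage with one row taken from the "cheapest" output in the description.
var cheapest = { rows: [
  { key: "apple", value: { store: "Apples Express", price: 0.79 } }
] };
var body = bulkDocsBody(rowsToDocs("cheapest", cheapest), {});
console.log(JSON.stringify(body));
```

A second-level view can then be defined on the target database like any other view. The cost is that the copy has to be kept in sync by hand, which is what native chained views would avoid.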

        Cortland Klein added a comment -

        Cloudant appears to have implemented a solution at http://support.cloudant.com/kb/views/chained-mapreduce-views .

        Christoph Zrenner added a comment -

        Hi all, I've been using CouchDB for about 6 months now (an amazing technology) and we implemented a "prediction engine" using a Bayesian classifier inside a view. The parameters for the machine learning algorithm are in a CommonJS array, and every 24 hours we modify the parameters based on the manually verified training data added that day (as in spam/ham classification), so the view is then rebuilt. It seems like this might be a use case that would be well served by chained views, so I'm adding it to this ticket:

        After the predictions are calculated, a number of further analytics calculations are performed on the predicted data, but the predictions are only temporary (the prediction parameters are updated every 24h). Right now, I'm doing the same Bayes prediction calculation in each of the analytics views (inside a CommonJS function "BayesPredictor"), which means that the same calculations are repeated for each analytics view.

        Chaining the temporary and computationally intense prediction calculation output with a subsequent analysis view "feels" like it might be a good solution for this problem. I'm reluctant to write the predictions into another database as in the Cloudant solution; if I were to go that route, I think I might as well just keep updating my source documents by going through all_documents and updating the predictions on each document every 24h.

        Would very much appreciate any views on whether this is a "valid" use-case for native chained views and any advice on how I might implement this! Thanks.


          People

          • Assignee:
            Unassigned
            Reporter:
            Joey Lawrance
          • Votes:
            17
            Watchers:
            10
