CouchDB / COUCHDB-1444

missing_named_view error on existing javascript design doc and view

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.1.1
    • Fix Version/s: 1.2.1, 1.3
    • Component/s: Database Core
    • Environment:

      Ubuntu 11.01 64 bit Erlang R13B03

    • Skill Level:
      Dont Know

      Description

      Moved over from issue: https://issues.apache.org/jira/browse/COUCHDB-1225 which has similar symptoms but the view is written in Erlang.

      On our production server for no apparent reason, one of our views just suddenly stopped responding to requests. The design document was still visible in Futon and the "all" view did provide a list of documents. All other views in the ddoc responded with a 404:

      {"error":"not_found","reason":"missing_named_view"}

      Restarting the couchdb server resolved the issue, and I've as yet been unable to reproduce the problem.

      Here is the last successful log entry for the view:

      [Fri, 16 Mar 2012 13:14:19 GMT] [info] [<0.831.531>] 192.168.163.3 - - 'GET' /maxi/_design/Payment/_view/by_journey_id_and_sequence?startkey=%5B%229bd1647eb09fca1634a8a6129a8cff46%22%2C%7B%7D%5D&endkey=%5B%229bd1647eb09fca1634a8a6129a8cff46%22%5D&limit=1&descending=true&include_docs=true&reduce=false 200

      Many requests later to other documents and views, here is when requests stopped working, some 6 minutes later:

      [Fri, 16 Mar 2012 13:20:29 GMT] [info] [<0.4510.531>] 192.168.163.3 - - 'GET' /maxi/_design/Payment/_view/by_user_id_and_created_at?startkey=%5B%22a0d0912e031b8fd28c2f89f828eebb12%22%5D&endkey=%5B%22a0d0912e031b8fd28c2f89f828eebb12%22%2C%7B%7D%5D&reduce=true&skip=0&limit=1 404
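
The percent-encoded query strings in these log entries are hard to read. As a reading aid only, here is a minimal Python sketch that decodes the failing request into its view parameters. The helper name `decode_view_query` is made up for illustration; it relies on every parameter in this particular request being valid JSON.

```python
import json
from urllib.parse import urlsplit, parse_qs

# Request path copied from the failing (404) log entry above.
REQUEST = (
    "/maxi/_design/Payment/_view/by_user_id_and_created_at"
    "?startkey=%5B%22a0d0912e031b8fd28c2f89f828eebb12%22%5D"
    "&endkey=%5B%22a0d0912e031b8fd28c2f89f828eebb12%22%2C%7B%7D%5D"
    "&reduce=true&skip=0&limit=1"
)


def decode_view_query(request):
    """Return (view path, dict of decoded query parameters).

    Hypothetical helper: assumes each parameter value is valid JSON,
    which holds for the request quoted in this report.
    """
    parts = urlsplit(request)
    params = {}
    for name, values in parse_qs(parts.query).items():
        # parse_qs already undoes the percent-encoding; json.loads then
        # turns '["...",{}]' into a Python list, 'true' into True, etc.
        params[name] = json.loads(values[0])
    return parts.path, params


path, params = decode_view_query(REQUEST)
print(path)
for name, value in params.items():
    print(name, "=", value)
```

Decoded this way, the request is an ordinary key-range query on a single user id, so nothing about the query itself looks unusual.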

      Here is the design document in question: https://gist.github.com/2050446

      I could see nothing in the logs out of the ordinary.

      Obviously, this problem is very alarming indeed and not something I've come across before in CouchDB. As you can see the view in question is related to Payments, which is something we really do not want to go wrong.

      Please let me know if I can provide more information.

          Activity

          Robert Newson added a comment -

          Possibly related to 1225 but this report is with the stock js viewer and an official CouchDB release so it seems the more interesting report.

          Robert Newson added a comment -

          Sam,

          Can you include the logs between the 200 and the 404? Does the view ever 'come back' or do you always restart?

          I'm pretty confident that the view is not 'gone' in the strong sense (not least because it can be regenerated from the database) but this could be a sign that some internal state has gone wrong.

          Sam Lown added a comment -

          Log entries between working view request and 404. Rows containing sensitive data (emails, tokens, etc.) have been removed.

          Sam Lown added a comment -

          Looking at the logs, and a bit before, it appears that the database stopped checkpointing view updates for the Payment design doc. Changes were made to other documents, and other views continued to update, but as soon as the Payment view was called, no attempt to update the view was made and a 404 was returned. Maybe this is related. A few log examples:

          [Fri, 16 Mar 2012 13:13:15 GMT] [info] [<0.275.0>] checkpointing view update at seq 65410 for maxi _design/Payment
          [Fri, 16 Mar 2012 13:13:15 GMT] [info] [<0.275.0>] checkpointing view update at seq 65421 for maxi _design/Payment

          [Fri, 16 Mar 2012 13:14:19 GMT] [info] [<0.831.531>] 192.168.163.3 - - 'GET' /maxi/_design/Payment/_view/by_journey_id_and_sequence?startkey=%5B%229bd1647eb09fca1634a8a6129a8cff46%22%2C%7B%7D%5D&endkey=%5B%229bd1647eb09fca1634a8a6129a8cff46%22%5D&limit=1&descending=true&include_docs=true&reduce=false 200

          [Fri, 16 Mar 2012 13:16:47 GMT] [info] [<0.12554.508>] checkpointing view update at seq 65422 for maxi _design/User

          [Fri, 16 Mar 2012 13:20:29 GMT] [info] [<0.4510.531>] 192.168.163.3 - - 'GET' /maxi/_design/Payment/_view/by_user_id_and_created_at?startkey=%5B%22a0d0912e031b8fd28c2f89f828eebb12%22%5D&endkey=%5B%22a0d0912e031b8fd28c2f89f828eebb12%22%2C%7B%7D%5D&reduce=true&skip=0&limit=1 404

          I've not seen anything else related to the Payment design doc out of the ordinary.
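
For anyone who wants to check their own logs for the same pattern, here is a rough Python sketch that reports the last checkpoint seen for each design doc. The function name `last_checkpoints` is hypothetical, and the regular expression is only an assumption based on the checkpoint lines quoted above; other CouchDB versions may format these lines differently.

```python
import re

# Pattern modelled on the "checkpointing view update" lines quoted above.
CHECKPOINT = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[info\] \[<[\d.]+>\] "
    r"checkpointing view update at seq (?P<seq>\d+) "
    r"for (?P<db>\S+) (?P<ddoc>_design/\S+)$"
)


def last_checkpoints(lines):
    """Map (db, ddoc) to the (timestamp, seq) of its most recent checkpoint line."""
    latest = {}
    for line in lines:
        match = CHECKPOINT.match(line.strip())
        if match:
            key = (match["db"], match["ddoc"])
            # Later lines overwrite earlier ones, so the last one wins.
            latest[key] = (match["time"], int(match["seq"]))
    return latest


# The three checkpoint lines from this comment.
SAMPLE = [
    "[Fri, 16 Mar 2012 13:13:15 GMT] [info] [<0.275.0>] checkpointing view update at seq 65410 for maxi _design/Payment",
    "[Fri, 16 Mar 2012 13:13:15 GMT] [info] [<0.275.0>] checkpointing view update at seq 65421 for maxi _design/Payment",
    "[Fri, 16 Mar 2012 13:16:47 GMT] [info] [<0.12554.508>] checkpointing view update at seq 65422 for maxi _design/User",
]

for (db, ddoc), (when, seq) in sorted(last_checkpoints(SAMPLE).items()):
    print(f"{db} {ddoc}: seq {seq} at {when}")
```

A design doc whose last checkpoint is well behind the others, while its views are still being requested, matches what is described here.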

          Matt Goodall added a comment -

          Looks like the same problem happened to me today, running:

          OS: Ubuntu 10.04.4 LTS
          Erlang: R13B03
          CouchDB: 1.2.0, compiled from the 1.2.x branch.

          As for Sam, views that had been working for some time (no design doc updates at all, in fact) suddenly started returning a 404 (not_found, missing_named_view) error.

          All failing views were in the same design doc. However, views in another design doc were still working correctly. So it seems as if a single design doc got forgotten somehow.

          No compaction (views or db) was running or had run recently.

          After restarting CouchDB, all views started working again.

          There's nothing in the log file that's any more useful than Sam already posted.

          I saw no obvious increase in open file descriptors or anything else that might reasonably explain a problem like this.

          Stefan Kögl added a comment - - edited

          I experienced the same symptoms on 1.2 after view compaction stopped due to a duplicate (discussed in [1]). Requests for the view that contained the duplicate were returning missing_named_view until the next restart of Couch, all other views continued working.

          [1] https://mail-archives.apache.org/mod_mbox/couchdb-user/201204.mbox/%3CCAPinO9e9z4bndagMFrgLk3tUFkSUM2xBiyCq%3DKQim-63s-Xs%2Bw%40mail.gmail.com%3E

          Stefan Kögl added a comment - - edited

          Seems unrelated to the duplicate issue I mentioned, because it happened again several times in the meantime - again on 1.2 (I can't update the ticket to reflect that). Unfortunately there was nothing useful in the log at info level.

          Sascha Reuter added a comment -

          Same problem just hit one of our production servers!

          Ubuntu 12.04 LTS (GNU/Linux 3.2.0-23-generic x86_64)
          otp_src_R15B01
          apache-couchdb-1.2.0

          I'll quote Matt for completeness, as our situation was exactly the same:

          "As for Sam, views that had been working for some time (no design doc updates at all, in fact) suddenly started returning a 404 (not_found, missing_named_view) error.

          All failing views were in the same design doc. However, views in another design doc were still working correctly. So it seems as if a single design doc got forgotten somehow.

          No compaction (views or db) was running or had run recently.

          After restarting CouchDB, all views started working again.

          There's nothing in the log file that's any more useful than Sam already posted.

          I saw no obvious increase in open file descriptors or anything else that might reasonably explain a problem like this."

          Alexander Shorin added a comment -

          I can confirm this on current 1.3.0@master (f0d6f19bc8) with the Python query server. The affected design doc had been edited directly via Futon shortly before the problem appeared. In all other cases the ddoc was manipulated through couchapp, with no such problems. I can also confirm that no compaction was running or had run recently, but in my case a restart did not change anything. Re-saving the ddoc or changing its views did not spawn or activate any query server processes. To fix the problem I had to remove the "defective" ddoc and upload it again to restore the views. The logs are clean: just info messages about GET requests that receive a 404 error. Unfortunately, I missed the chance to check which query server process served the "defective" ddoc, or whether it was alive at all.

          Tuyen Tran added a comment -

          We've seen this problem twice now in the last 30 days (rock solid for two+ years otherwise). Our situation was same as above: view started returning missing_named_view; restart fixed the issue. View was trivial; design doc had just one view. Problem was localized to just that one view in that one database. Nothing useful in the log.

          VMWare
          Ubuntu 10.04 LTS (GNU/Linux 2.6.32-24-server x86_64)
          R13B03
          couchdb 1.1.1

          Alexey Loshkarev added a comment -

          I'm encountering this problem almost every day on my CouchDB 1.2.0 installation.

          The problem occurs in 2 databases, which have 8-10 design docs.

          All of these databases are configured to be compacted every day:

          exhaust_orders = [{db_fragmentation, "50%"}, {view_fragmentation, "30%"}]
          exhaust = [{db_fragmentation, "30%"}, {view_fragmentation, "30%"}]
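
To make the thresholds above concrete: as I understand the compaction daemon's documented formula, fragmentation is the share of the file not occupied by live data, (file_size - data_size) / file_size * 100. The Python sketch below only illustrates that arithmetic. The function names and the byte counts are invented for the example, and whether the daemon compares with "greater than" or "greater than or equal" is an assumption.

```python
def fragmentation_percent(file_size, data_size):
    """Percentage of the file that is not live data.

    Written as (file_size - data_size) * 100 / file_size, which is the
    same quantity as (file_size - data_size) / file_size * 100.
    """
    if file_size <= 0:
        return 0.0
    return (file_size - data_size) * 100 / file_size


def exceeds_threshold(threshold, file_size, data_size):
    """Check a fragmentation value against a rule such as "30%".

    Assumption: a strict 'greater than' comparison; the real daemon may differ.
    """
    limit = float(threshold.rstrip("%"))
    return fragmentation_percent(file_size, data_size) > limit


# Made-up sizes in bytes, purely for illustration.
file_size, data_size = 1000, 600
print(fragmentation_percent(file_size, data_size))     # 40.0
print(exceeds_threshold("30%", file_size, data_size))  # True
print(exceeds_threshold("50%", file_size, data_size))  # False
```

With these example numbers, a view index would qualify for compaction under the 30% view_fragmentation rule, while a database file of the same shape would not under the 50% db_fragmentation rule for exhaust_orders.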

          Here is a description of the most recently logged occurrence:

          First, I received a notification from a cron job (a pinger) that triggers view indexing every minute for every database.

          The notification was about "missing_named_view".

          Here is the log message (couchdb.log):

          [Tue, 24 Jul 2012 07:37:34 GMT] [info] [<0.24555.62>] 10.0.0.41 - - GET /exhaust_events/_design/driver/_view/events?limit=11&reduce=false 404
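
A pinger like the one described above could tell this specific failure apart from other errors. The following Python sketch is one way to do that, not the script actually used here; `classify` and `ping_view` are hypothetical names, and the only CouchDB-specific detail it relies on is the 404 body quoted in this report.

```python
import json
import urllib.error
import urllib.request


def classify(status, body):
    """Classify a view response as 'ok', 'missing_named_view', or 'other_error'."""
    if status == 200:
        return "ok"
    try:
        payload = json.loads(body)
    except ValueError:
        return "other_error"
    if (
        status == 404
        and isinstance(payload, dict)
        and payload.get("reason") == "missing_named_view"
    ):
        return "missing_named_view"
    return "other_error"


def ping_view(url, timeout=10):
    """GET a view URL and classify the result.

    Connection failures (urllib.error.URLError) are deliberately not caught,
    so an unreachable server is reported separately from a missing view.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify(response.status, response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the body is still readable.
        return classify(err.code, err.read().decode("utf-8"))


# Offline demonstration using the error body quoted in this report.
print(classify(404, '{"error":"not_found","reason":"missing_named_view"}'))
```

Alerting on "missing_named_view" separately makes it easier to correlate occurrences with compaction or design doc edits.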

          I opened the Futon page for this view and pressed the "compact view" button.

          Surprisingly, view compaction started for a different design doc (_design/events), not mine (_design/driver)!
          Here is the running task:

          View compaction exhaust_events, _design/events 2012-07-24 10:55:47 2012-07-24 10:56:07 <0.24784.62> Progress 12%

          But couchdb.log shows that compaction was requested for the correct design doc:

          [Tue, 24 Jul 2012 07:55:47 GMT] [info] [<0.24555.62>] 10.0.0.41 - - POST /exhaust_events/_compact/driver 202

          There are no other compaction requests in the log for that time period.

          Later, when the compaction finished, I saw this in couchdb.log:

          [Tue, 24 Jul 2012 07:58:39 GMT] [info] [<0.939.0>] View index compaction still behind for exhaust_events _design/events – current: 6473109 compact: 6473081
          [Tue, 24 Jul 2012 07:58:39 GMT] [info] [<0.939.0>] View index compaction complete for exhaust_events _design/events
          [Tue, 24 Jul 2012 07:58:39 GMT] [error] [<0.939.0>] ** Generic server <0.939.0> terminating

              ** Last message in was {compact_done,
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,135,59,
              95,56>>,
              <0.24781.62>,<<"_design/events">>,<<"javascript">>,
              [],
              [{view,0,6473119,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              {btree,<0.24781.62>,
              {308295893,{5879466,[]},154301438},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_view.less_json_ids.2>,
              #Fun<couch_view_group.10.26766604>,snappy},
              [],[]}],
              {[]},
              {btree,<0.24781.62>,
              {308292754,[],152742682},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_btree.5.93788370>,nil,snappy},
              6473119,0,nil,nil}}
              ** When Server state == {group_state,undefined,<<"exhaust_events">>,
              {"/var/lib/couchdb",<<"exhaust_events">>,
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,135,59,
              95,56>>,
              nil,<<"_design/events">>,<<"javascript">>,[],
              [{view,0,0,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              nil,[],[]}],
              {[]},
              nil,0,0,nil,nil}},
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,135,59,
              95,56>>,
              <0.940.0>,<<"_design/events">>,<<"javascript">>,[],
              [{view,0,6473109,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              {btree,<0.940.0>,
              {578555333,{5879456,[]},161851847},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_view.less_json_ids.2>,
              #Fun<couch_view_group.10.26766604>,snappy},
              [],[]}],
              {[]},
              {btree,<0.940.0>,
              {578551416,[],157784808},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_btree.5.93788370>,nil,snappy},
              6473109,0,nil,nil},
              nil,nil,false,[],<0.942.0>,false}

              ** Reason for termination ==
              ** {badarg,[{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]}

              [Tue, 24 Jul 2012 07:58:39 GMT] [error] [<0.939.0>] {error_report,<0.32.0>,
              {<0.939.0>,crash_report,
              [[{initial_call,
              {couch_view_group,init,['Argument__1']}},
              {pid,<0.939.0>},
              {registered_name,[]},
              {error_info,
              {exit,
              {badarg,
              [{erlang,unlink,[nil]}, {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]},
              [{gen_server,terminate,6},
              {proc_lib,init_p_do_apply,3}]}},
              {ancestors,[<0.938.0>]},
              {messages,[]},
              {links,[<0.940.0>,<0.126.0>]},
              {dictionary,[]},
              {trap_exit,true},
              {status,running},
              {heap_size,2584},
              {stack_size,24},
              {reductions,324488}],
              []]}}
              [Tue, 24 Jul 2012 07:58:39 GMT] [error] [<0.566.0>] ** Generic server <0.566.0> terminating
              ** Last message in was {'EXIT',<0.24784.62>,
              {{badarg,
              [{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3}, {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]},
              {gen_server,call,
              [<0.939.0>,
              {compact_done,
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,135,
              59,95,56>>,
              <0.24781.62>,<<"_design/events">>,
              <<"javascript">>,[],
              [{view,0,6473119,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              {btree,<0.24781.62>,
              {308295893,{5879466,[]},154301438},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_view.less_json_ids.2>,
              #Fun<couch_view_group.10.26766604>,snappy},
              [],[]}],
              {[]},
              {btree,<0.24781.62>,
              {308292754,[],152742682},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_btree.5.93788370>,nil,snappy},
              6473119,0,nil,nil}}]}}}
              ** When Server state == {group_state,undefined,<<"exhaust_events">>,
              {"/var/lib/couchdb",<<"exhaust_events">>,
              {group,
              <<52,56,143,106,60,215,14,14,141,162,67,151,219,113,
              2,81>>,
              nil,<<"_design/driver">>,<<"javascript">>,[],
              [{view,0,0,0,[],
              <<"function(doc) {\n\tif(doc.driver){\n \temit([doc.driver, doc.date.slice(0, 10), doc.date.slice(11, 19)], 1);\n\t}\n}">>,
              nil,
              [{<<"events">>,
              <<"function(keys, values) {\n return sum(values);\n}">>}],
              []}],
              {[]},
              nil,0,0,nil,nil}},
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,135,59,
              95,56>>,
              <0.940.0>,<<"_design/events">>,<<"javascript">>,[],
              [{view,0,6473109,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              {btree,<0.940.0>,
              {578548274,{5879456,[]},161852193},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_view.less_json_ids.2>,
              #Fun<couch_view_group.10.26766604>,snappy},
              [],[]}],
              {[]},
              {btree,<0.940.0>,
              {578544609,[],157785117},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_btree.5.93788370>,nil,snappy},
              6473109,0,nil,nil},
              nil,<0.24784.62>,false,[],<0.17570.0>,false}
              ** Reason for termination ==
              ** {{badarg,
              [{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5}, {proc_lib,init_p_do_apply,3}]},
              {gen_server,call,
              [<0.939.0>,
              {compact_done,
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,135,59,95,56>>,
              <0.24781.62>,<<"_design/events">>,<<"javascript">>,[],
              [{view,0,6473119,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              {btree,<0.24781.62>,
              {308295893,{5879466,[]},154301438},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_view.less_json_ids.2>,
              #Fun<couch_view_group.10.26766604>,snappy},
              [],[]}],
              {[]},
              {btree,<0.24781.62>,
              {308292754,[],152742682},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_btree.5.93788370>,nil,snappy},
              6473119,0,nil,nil}}]}}

              [Tue, 24 Jul 2012 07:58:39 GMT] [error] [<0.566.0>] {error_report,<0.32.0>,
              {<0.566.0>,crash_report,
              [[{initial_call,{couch_view_group,init,['Argument__1']}},
              {pid,<0.566.0>},
              {registered_name,[]},
              {error_info,
              {exit,
              {{badarg,
              [{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]},
              {gen_server,call,
              [<0.939.0>,
              {compact_done,
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,
              135,59,95,56>>,
              <0.24781.62>,<<"_design/events">>,
              <<"javascript">>,[],
              [{view,0,6473119,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              {btree,<0.24781.62>,
              {308295893,{5879466,[]},154301438},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_view.less_json_ids.2>,
              #Fun<couch_view_group.10.26766604>,snappy},
              [],[]}],
              {[]},
              {btree,<0.24781.62>,
              {308292754,[],152742682},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_btree.5.93788370>,nil,snappy},
              6473119,0,nil,nil}}]}},
              [{gen_server,terminate,6},
              {proc_lib,init_p_do_apply,3}]}},
              {ancestors,[<0.565.0>]},
              {messages,[]},
              {links,[<0.126.0>]},
              {dictionary,[]},
              {trap_exit,true},
              {status,running},
              {heap_size,1597},
              {stack_size,24},
              {reductions,340978}],
              []]}}
              [Tue, 24 Jul 2012 07:58:39 GMT] [error] [<0.940.0>] ** Generic server <0.940.0> terminating
              ** Last message in was {'EXIT',<0.939.0>,
              {badarg,
              [{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]}}
              ** When Server state == {file,{file_descriptor,prim_file,{#Port<0.2991>,126}},
              578560125}
              ** Reason for termination ==
              ** {badarg,[{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]}

              [Tue, 24 Jul 2012 07:58:39 GMT] [error] [<0.940.0>] {error_report,<0.32.0>,
              {<0.940.0>,crash_report,
              [[{initial_call,{couch_file,init,['Argument__1']}},
              {pid,<0.940.0>},
              {registered_name,[]},
              {error_info,
              {exit,
              {badarg,
              [{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]},
              [{gen_server,terminate,6},
              {proc_lib,init_p_do_apply,3}]}},
              {ancestors,[<0.939.0>,<0.938.0>]},
              {messages,[{'EXIT',<0.942.0>,shutdown}]},
              {links,[]},
              {dictionary,[]},
              {trap_exit,true},
              {status,running},
              {heap_size,377},
              {stack_size,24},
              {reductions,165720368}],
              []]}}
              [Tue, 24 Jul 2012 07:58:39 GMT] [error] [<0.24781.62>] ** Generic server <0.24781.62> terminating
              ** Last message in was {'EXIT',<0.24784.62>,
              {{badarg,
              [{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]},
              {gen_server,call,
              [<0.939.0>,
              {compact_done,
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,135,
              59,95,56>>,
              <0.24781.62>,<<"_design/events">>,
              <<"javascript">>,[],
              [{view,0,6473119,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              {btree,<0.24781.62>,
              {308295893,{5879466,[]},154301438},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_view.less_json_ids.2>,
              #Fun<couch_view_group.10.26766604>,snappy},
              [],[]}],
              {[]},
              {btree,<0.24781.62>,
              {308292754,[],152742682},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_btree.5.93788370>,nil,snappy},
              6473119,0,nil,nil}}]}}}
              ** When Server state == {file,
              {file_descriptor,prim_file,{#Port<0.851695>,121}},
              308297853}
              ** Reason for termination ==
              ** {{badarg,
              [{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]},
              {gen_server,call,
              [<0.939.0>,
              {compact_done,
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,135,59,95,56>>,
              <0.24781.62>,<<"_design/events">>,<<"javascript">>,[],
              [{view,0,6473119,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              {btree,<0.24781.62>,
              {308295893,{5879466,[]},154301438},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_view.less_json_ids.2>,
              #Fun<couch_view_group.10.26766604>,snappy},
              [],[]}],
              {[]},
              {btree,<0.24781.62>,
              {308292754,[],152742682},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_btree.5.93788370>,nil,snappy},
              6473119,0,nil,nil}}]}}

              [Tue, 24 Jul 2012 07:58:39 GMT] [error] [<0.24781.62>] {error_report,<0.32.0>,
              {<0.24781.62>,crash_report,
              [[{initial_call,{couch_file,init,['Argument__1']}},
              {pid,<0.24781.62>},
              {registered_name,[]},
              {error_info,
              {exit,
              {{badarg,
              [{erlang,unlink,[nil]},
              {couch_view_group,handle_call,3},
              {gen_server,handle_msg,5},
              {proc_lib,init_p_do_apply,3}]},
              {gen_server,call,
              [<0.939.0>,
              {compact_done,
              {group,
              <<255,131,167,66,53,46,245,243,22,92,235,227,
              135,59,95,56>>,
              <0.24781.62>,<<"_design/events">>,
              <<"javascript">>,[],
              [{view,0,6473119,0,
              [<<"list">>],
              <<"function(doc) {\n emit([doc.date, doc.facility], null);\n}">>,
              {btree,<0.24781.62>,
              {308295893,{5879466,[]},154301438},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_view.less_json_ids.2>,
              #Fun<couch_view_group.10.26766604>,
              snappy},
              [],[]}],
              {[]},
              {btree,<0.24781.62>,
              {308292754,[],152742682},
              #Fun<couch_btree.3.71804109>,
              #Fun<couch_btree.4.115144917>,
              #Fun<couch_btree.5.93788370>,nil,snappy},
              6473119,0,nil,nil}}]}},
              [{gen_server,terminate,6},
              {proc_lib,init_p_do_apply,3}]}},
              {ancestors,[<0.566.0>,<0.565.0>]},
              {messages,[]},
              {links,[]},
              {dictionary,[]},
              {trap_exit,true},
              {status,running},
              {heap_size,1597},
              {stack_size,24},
              {reductions,120074193}],
              []]}}

          And one more surprise: the original view is accessible again:

          [Tue, 24 Jul 2012 07:59:31 GMT] [info] [<0.23744.62>] 10.0.0.41 - - GET /exhaust_events/_design/driver/_view/events?limit=11&reduce=false 200

          So I think the CouchDB view indexer mixed up some view data internally.

          More info:

          I have 3 nodes with identical databases,
          but the problem occurred on only one node.

          The problem node is running CouchDB 1.2.0 and Erlang 13.2.4.
          The nodes that work fine are running CouchDB 1.2.0 and Erlang 14.2.1.

          I'm going to upgrade Erlang on the problem node today and will report whether the problem goes away.
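The 404 and 200 responses quoted in this comment can be pulled out of couchdb.log mechanically, which makes it easier to see when a view flips between the two. The following is only a rough sketch: the regular expressions are assumptions derived from the `[info]` request lines shown in this issue, and `view_requests` is a hypothetical helper name, not part of CouchDB.

```python
import re

# Assumed shape of a request line, based on the [info] entries quoted in this issue.
# The HTTP method may or may not be wrapped in single quotes.
LOG_LINE = re.compile(
    r"\[(?P<time>[^\]]+)\] \[info\] \[(?P<pid>[^\]]+)\] "
    r"(?P<client>\S+) - - '?(?P<method>[A-Z]+)'? "
    r"(?P<path>\S+) (?P<status>\d{3})$"
)
# Path of a view request: /<db>/_design/<ddoc>/_view/<view>?...
VIEW_PATH = re.compile(
    r"^/(?P<db>[^/]+)/_design/(?P<ddoc>[^/]+)/_view/(?P<view>[^/?]+)"
)


def view_requests(lines):
    """Yield (time, db, ddoc, view, status) for every view request line."""
    for line in lines:
        entry = LOG_LINE.search(line.strip())
        if entry is None:
            continue  # not a request line (e.g. part of a crash report)
        view = VIEW_PATH.match(entry.group("path"))
        if view is None:
            continue  # a request, but not for a view
        yield (
            entry.group("time"),
            view.group("db"),
            view.group("ddoc"),
            view.group("view"),
            int(entry.group("status")),
        )


sample = [
    "[Tue, 24 Jul 2012 07:37:34 GMT] [info] [<0.24555.62>] 10.0.0.41 - - GET "
    "/exhaust_events/_design/driver/_view/events?limit=11&reduce=false 404",
    "[Tue, 24 Jul 2012 07:55:47 GMT] [info] [<0.24555.62>] 10.0.0.41 - - POST "
    "/exhaust_events/_compact/driver 202",
    "[Tue, 24 Jul 2012 07:59:31 GMT] [info] [<0.23744.62>] 10.0.0.41 - - GET "
    "/exhaust_events/_design/driver/_view/events?limit=11&reduce=false 200",
]
results = list(view_requests(sample))
for row in results:
    print(row)
```

Filtering the yielded tuples on status 404 gives the window during which a view was unavailable.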

          Stefan Kögl added a comment -

          Another machine appears to have the problem regularly. This time it's rather decent hardware. I noted the following facts about my cases:

          • the problem doesn't seem to depend on high or low load, but it appears at least once a day, going away only after a restart of Couch
          • Couch is not running out of file descriptors (netstat shows ~70 connections, with 1024 file descriptors available to the process in total)
          • memory usage is not particularly high (beam.smp taking <1% of the available memory)
          • CPU usage is about 3 of 8 cores
          • Couch is running on R15B01
          • iostat shows disk utilization <20%
          • the filesystem is not running out of space (>50% free)
          • no exceptions in the log (so far)

          Any ideas what is necessary to debug this?

          Robert Newson added a comment -

          I have some more news. Stefan was gracious enough to open distributed Erlang on his server and grant me ssh access to the box. The error arose today and I've looked inside the Erlang VM.

          What's happening is the view_group process has, somehow, got the wrong design document fragment inside it. We look up the view group by signature and then ask that process to match on the view name.

          The init_args in the group_state for "_design/podcastlists" are correct (they have "by_user_slug" and "by_rating"); the group_state's group member, however, is for a completely different design document ("_design/chapters").
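As a rough illustration of why this state produces the 404, the lookup can be modelled in a few lines. This is a simplified sketch in Python, not CouchDB's Erlang code: `Group`, `GroupState`, `find_view`, and the registry key `"sig-podcastlists"` are hypothetical stand-ins for the records and signature described above.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Stand-in for the group record: a design document and its view names."""
    ddoc_id: str
    view_names: list


@dataclass
class GroupState:
    """Stand-in for group_state: what the process was started with
    (init_args) versus the group it actually holds."""
    init_args: Group
    group: Group


def find_view(registry, signature, view_name):
    """Look up the process by signature, then match on the view name."""
    state = registry[signature]
    if view_name in state.group.view_names:
        return {"status": 200}
    return {"status": 404, "error": "not_found", "reason": "missing_named_view"}


podcastlists = Group("_design/podcastlists", ["by_user_slug", "by_rating"])
chapters = Group("_design/chapters", ["by_episode"])

# The inconsistent state observed on the server: correct init_args, wrong group.
registry = {"sig-podcastlists": GroupState(init_args=podcastlists, group=chapters)}

print(find_view(registry, "sig-podcastlists", "by_rating"))
print(find_view(registry, "sig-podcastlists", "by_episode"))
```

In this model a view that the design document really declares is reported as missing, while a view belonging to the other design document is found, because only the `group` member is consulted when matching the name.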

          I proved this mismatch comprehensively;

          Here's the design document, with only "by_user_slug" and "by_rating";

          rnewson@foo:~$ curl "http://127.0.0.1:5984/mygpo/_design/podcastlists"
          {"_id":"_design/podcastlists","_rev":"13-2faa3f672e06eae3c727fec5f317b65c","views":{"by_user_slug":{"map":"function(doc)\n{\n if(doc.doc_type == \"PodcastList\")\n {\n emit([doc.user, doc.slug], null);\n }\n}"},"by_rating":{"map":"function(doc)\n{\n if(doc.doc_type == \"PodcastList\")\n {\n if(doc.podcasts == null || doc.podcasts.length == 0)\n {\n return;\n }\n\n var rating = 0;\n for(var n in doc.ratings)\n {\n rating += doc.ratings[n].rating;\n }\n\n if (rating >= 0)\n {\n emit(rating, null);\n }\n }\n}"}},"couchapp":{"signatures":{},"objects":{},"manifest":["views/","views/by_user_slug/","views/by_user_slug/map.js","views/by_rating/","views/by_rating/map.js"]}}

          But "by_rating" gives a 404, missing_named_view;

          rnewson@foo:~$ curl "http://127.0.0.1:5984/mygpo/_design/podcastlists/_view/by_rating?limit=0"

          {"error":"not_found","reason":"missing_named_view"}

          And "by_episode" which is not present in "_design/podcastlists" but is present in "_design/chapters" is found;
          rnewson@foo:~$ curl "http://127.0.0.1:5984/mygpo/_design/podcastlists/_view/by_episode?limit=0"

          {"total_rows":253,"offset":0,"rows":[]}

          And here's "_design/chapters";
          rnewson@foo:~$ curl "http://127.0.0.1:5984/mygpo/_design/chapters"
          {"_id":"_design/chapters","_rev":"13-0106a7e9d80b9318d6c38370cf197a42","views":{"by_episode":{"map":"function(doc)\n{\n if(doc.doc_type == \"EpisodeUserState\")\n {\n for(var n in doc.chapters)\n {\n var chapter = doc.chapters[n];\n emit([doc.episode, doc.user], chapter);\n }\n }\n}"}},"couchapp":{"signatures":{},"objects":{},"manifest":["views/","views/by_episode/","views/by_episode/map.js"]}}
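The manual curl comparison above could be automated along these lines: fetch a design document, request each view it declares with `limit=0`, and report the ones that answer 404. This is a hedged sketch, not an official tool; `http_status` and `missing_views` are hypothetical helper names, and the demo uses a fake status function (an assumption mirroring the broken state described here, where only "by_episode" resolves) so that it runs without a server.

```python
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def missing_views(ddoc, db_url, get_status=http_status):
    """Return the names of views declared in ddoc that respond with 404.

    ddoc is the parsed design document; db_url is the database URL,
    e.g. "http://127.0.0.1:5984/mygpo".
    """
    ddoc_url = db_url.rstrip("/") + "/" + ddoc["_id"]
    missing = []
    for name in sorted(ddoc.get("views", {})):
        if get_status(f"{ddoc_url}/_view/{name}?limit=0") == 404:
            missing.append(name)
    return missing


def fake_status(url):
    """Illustrative stand-in for the broken server: only by_episode resolves."""
    return 200 if "/_view/by_episode?" in url else 404


ddoc = {
    "_id": "_design/podcastlists",
    "views": {"by_user_slug": {"map": "..."}, "by_rating": {"map": "..."}},
}
print(missing_views(ddoc, "http://127.0.0.1:5984/mygpo", fake_status))
```

Against a real server the third argument would be omitted, so that `http_status` performs the requests; a non-empty result for a design document that has not changed would point at the mismatch described in this comment.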

          Robert Newson added a comment -

          The full group_state for other devs;

          (couchdb@127.0.0.1)9> io:format("~p~n", [sys:get_status(pid(0,266,0))]).
          {status,<4665.266.0>,
          {module,gen_server},
          [[{'$ancestors',[<4665.265.0>]},
          {'$initial_call',{couch_view_group,init,1}}],
          running,<4665.265.0>,[],
          [{header,"Status for generic server <0.266.0>"},
          {data,[{"Status",running},
          {"Parent",<4665.265.0>},
          {"Logged events",[]}]},
          {data,
          [{"State",
          {group_state,undefined,<<"mygpo">>,
          {"/var/lib/couchdb",<<"mygpo">>,
          {group,
          <<191,164,86,206,176,78,160,77,123,154,251,56,80,36,128,
          157>>,
          nil,<<"_design/podcastlists">>,<<"javascript">>,[],
          [{view,0,0,0,
          [<<"by_user_slug">>],
          <<"function(doc)\n{\n if(doc.doc_type == \"PodcastList\")\n {\n emit([doc.user, doc.slug], null);\n }\n}">>,
          nil,[],[]},
          {view,1,0,0,
          [<<"by_rating">>],
          <<"function(doc)\n{\n if(doc.doc_type == \"PodcastList\")\n {\n if(doc.podcasts == null || doc.podcasts.length == 0)\n {\n return;\n }\n\n var rating = 0;\n for(var n in doc.ratings)\n {\n rating += doc.ratings[n].rating;\n }\n\n if (rating >= 0)\n {\n emit(rating, null);\n }\n }\n}">>,
          nil,[],[]}],
          {[]},
          nil,0,0,nil,nil}},
          {group,
          <<42,63,93,206,0,242,69,90,79,8,224,111,56,142,84,60>>,
          <4665.213.0>,<<"_design/chapters">>,<<"javascript">>,[],
          [{view,0,0,0,
          [<<"by_episode">>],
          <<"function(doc)\n{\n if(doc.doc_type == \"EpisodeUserState\")\n {\n for(var n in doc.chapters)\n {\n var chapter = doc.chapters[n];\n emit([doc.episode, doc.user], chapter);\n }\n }\n}">>,
          {btree,<4665.213.0>,
          {18823,{253,[]},13861},
          #Fun<couch_btree.3.133731799>,
          #Fun<couch_btree.4.133731799>,
          #Fun<couch_view.less_json_ids.2>,
          #Fun<couch_view_group.10.26766604>,snappy},
          [],[]}],
          {[]},
          {btree,<4665.213.0>,
          {4921,[],5317},
          #Fun<couch_btree.3.133731799>,
          #Fun<couch_btree.4.133731799>,
          #Fun<couch_btree.5.133731799>,nil,snappy},
          180079624,0,nil,nil},
          nil,nil,false,[],<4665.269.0>,false}}]}]]}
          ok

          Robert Newson added a comment -

          This was all on 1.2.0.

          Paul Joseph Davis added a comment -

          Applied the fix tested by Stefan to 1.2.x. Ported the fix to master as well. Do we care about 1.1.x?

          Calle Arnesten added a comment -

          I had the same problem today in production (CouchDB 1.2.0). It happened during a replication of about 500k documents. It was a pull-based replication, and it was the source server that had the problem. The replication had been running for about 1 hour when it occurred. A restart solved it.

          Calle Arnesten added a comment -

          @Robert Newson: Great to see that it is fixed in 1.2.1. Thanks a lot!

          Until 1.2.1 is released, is there any special care one could take to not trigger this bug? In other words, what are the conditions for triggering the bug?

          Robert Newson added a comment -

          I marked it as fixed in 1.2.1 to reflect the fact that the fix is on the 1.2.x branch in git. I don't think there will be a 1.2.1 release; the next release should be 1.3.0 (which also includes the fix).

          The only thing you can do to avoid the issue is not to use views at all, so I would apply the patch if you need it sooner than our next release.


            People

            • Assignee:
              Unassigned
              Reporter:
              Sam Lown
            • Votes:
              3
              Watchers:
              9