CouchDB / COUCHDB-761

Timeouts in couch_log are masked, crashes callers

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.10.1, 0.10.2, 0.11
    • Fix Version/s: 0.11.1, 1.0, 1.0.3
    • Component/s: Database Core
    • Labels: None

    Description

      Several users have reported crash reports stemming from a function_clause error in handle_info in various gen_servers. The offending message looks like

      {#Ref<>, <integer>}

      After months of banter and sleuthing, I determined that the likely cause was a late reply to a gen_server:call that timed out, with the #Ref being the tag on the response. After it came up again today in IRC, kocolosk quickly discovered that the problem appears to be in couch_log.erl.

      The logging macros (?LOG_*) call couch_log:*_on, which calls get_level_integer/0. When this call times out, the timeout is swallowed and the late reply is delivered to the calling process afterwards, triggering the crash.
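
      To illustrate the mechanism, here is a minimal sketch. It is not the couch_log.erl source: the module name, the message shapes, the default level of 0, and the delays are all made up for the example, and it uses a hand-rolled call/reply instead of gen_server:call so that only the essential steps are visible. The caller hides the timeout, yet the reply still lands in its mailbox as {Ref, Integer}.

```erlang
%% Sketch only: hypothetical names, not the real couch_log code.
-module(late_reply_sketch).
-export([demo/0]).

%% Stand-in for the log server: answers level requests slowly.
slow_server() ->
    receive
        {call, {From, Ref}, get_level_integer} ->
            timer:sleep(200),          % simulate a busy server
            From ! {Ref, 2},           % reply tagged with the caller's Ref
            slow_server()
    end.

%% Caller side: wait briefly, then swallow the timeout and return a
%% default. This is the "masked timeout" described in the report.
get_level_masked(Server) ->
    Ref = make_ref(),
    Server ! {call, {self(), Ref}, get_level_integer},
    receive
        {Ref, Level} -> Level
    after 50 ->
        0                              % timeout hidden from the caller
    end.

demo() ->
    Server = spawn(fun slow_server/0),
    Default = get_level_masked(Server),
    %% The reply was never cancelled, so it shows up later as a stray
    %% {Ref, Integer} message in this process's mailbox.
    Late = receive
               {Ref, Level} when is_reference(Ref) -> {late_reply, Level}
           after 1000 ->
               no_late_reply
           end,
    exit(Server, kill),
    {Default, Late}.
```

      With these delays, late_reply_sketch:demo() returns {0, {late_reply, 2}}. In a gen_server caller, that stray tuple would be handed to handle_info, and a callback with no clause for it fails with function_clause.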

      Suggestions on how to fix this are welcome. Ideas so far are asynchronous logging or an infinite timeout.
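
      The two ideas could look roughly like this. This is a sketch under assumptions, not a proposed patch: the module name, the function names, and the {log, Level, Msg} message shape are hypothetical, and the log server is assumed to be a gen_server that handles these requests. Only gen_server:call/3 and gen_server:cast/2 are real OTP calls.

```erlang
%% Sketch only: hypothetical module and message shapes.
-module(log_fix_sketch).
-export([get_level_blocking/1, log_async/3]).

%% Idea 1: infinite timeout. The call never gives up, so there is no
%% timeout to mask and no reply can arrive "late".
get_level_blocking(Server) ->
    gen_server:call(Server, get_level_integer, infinity).

%% Idea 2: asynchronous logging. A cast sends no reply at all, so the
%% caller's mailbox never receives a {Ref, Reply} tuple.
log_async(Server, Level, Msg) ->
    gen_server:cast(Server, {log, Level, Msg}).
```

      The trade-off is that an infinite timeout makes every logging caller block for as long as the log server is stuck, while a cast gives the caller no back-pressure and no way to learn the current level before formatting a message.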

      Attachments

        1. improved-sync-logging-v2.patch (5 kB, Randall Leeds)
        2. improved-sync-logging.patch (2 kB, Randall Leeds)

          People

            Assignee: Unassigned
            Reporter: Randall Leeds (tilgovi)