COUCHDB-761

Timeouts in couch_log are masked, crashes callers

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.10.1, 0.10.2, 0.11
    • Fix Version/s: 0.11.1, 1.0, 1.0.3
    • Component/s: Database Core
    • Labels:
      None

      Description

      Several users have reported crash reports stemming from a function_clause error in handle_info in various gen_servers. The offending message looks like:

      {#Ref<>, <integer>}

      After months of banter and sleuthing, I determined that the likely cause was a late reply to a gen_server:call that timed out, with the #Ref being the tag on the response. After it came up again today in IRC, kocolosk quickly discovered that the problem appears to be in couch_log.erl.

      The logging macros (?LOG_*) call couch_log:*_on, which calls get_level_integer/0. When that call times out, the timeout is swallowed, and the late reply arrives at the calling process afterwards, triggering the crash.
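      The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not CouchDB code: the module name late_reply_demo, the helper call/3, and slow_server/1 are all hypothetical, and the call is hand-rolled rather than going through gen_server:call so that the behaviour does not depend on the OTP release. It shows a timeout being turned into an ordinary value by `catch`, followed by the reply landing in the caller's mailbox as a {Ref, Integer} tuple.

```erlang
-module(late_reply_demo).
-export([run/0]).

%% Hand-rolled synchronous call (hypothetical helper): tag the request with a
%% fresh reference, then wait up to Timeout ms for a {Ref, Reply} message.
call(Server, Request, Timeout) ->
    Ref = make_ref(),
    Server ! {call, self(), Ref, Request},
    receive
        {Ref, Reply} -> Reply
    after Timeout ->
        exit(timeout)
    end.

%% Hypothetical slow server: answers each request only after DelayMs.
slow_server(DelayMs) ->
    receive
        {call, From, Ref, get_level_integer} ->
            timer:sleep(DelayMs),
            From ! {Ref, 2},
            slow_server(DelayMs)
    end.

run() ->
    Server = spawn(fun() -> slow_server(200) end),
    %% `catch` converts the timeout exit into a plain return value,
    %% so the caller carries on as if nothing had gone wrong.
    Masked = (catch call(Server, get_level_integer, 50)),
    %% The server still sends its reply; it reaches our mailbox later.
    Late = receive
               {Ref, Level} when is_reference(Ref) -> {late_reply, Level}
           after 1000 ->
               none
           end,
    exit(Server, kill),
    {Masked, Late}.
```

      In this sketch the caller picks the stray message up with an explicit receive. In a gen_server the same message would instead be delivered to handle_info, and a callback with no clause matching {Ref, Integer} fails with function_clause, which matches the crash reports.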

      Suggestions on how to fix this welcome. Ideas so far are async logging or infinite timeout.
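      The two ideas mentioned so far could look roughly like this. This is a sketch under stated assumptions, not the contents of the attached patches: log_sketch is a hypothetical stand-alone gen_server, and its function names, state layout, and numeric levels are made up for illustration. With an infinite timeout the caller never gives up, so no reply can arrive "late". With asynchronous logging the caller sends a cast and the level check moves into the logger process, so there is no reply at all.

```erlang
-module(log_sketch).
-behaviour(gen_server).

-export([start_link/0, get_level_integer/1, log_async/3, entries/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start a logger with a hypothetical minimum level of 2.
start_link() ->
    gen_server:start_link(?MODULE, 2, []).

%% Idea 1: infinite timeout. The caller blocks until the reply arrives,
%% so the call can never time out and leave a stray reply behind.
get_level_integer(Pid) ->
    gen_server:call(Pid, get_level_integer, infinity).

%% Idea 2: asynchronous logging. The caller fires a cast and moves on;
%% filtering by level happens inside the logger process.
log_async(Pid, Level, Msg) ->
    gen_server:cast(Pid, {log, Level, Msg}).

%% Helper for inspecting what was kept, oldest entry first.
entries(Pid) ->
    gen_server:call(Pid, entries, infinity).

init(MinLevel) ->
    {ok, {MinLevel, []}}.

handle_call(get_level_integer, _From, {MinLevel, _} = State) ->
    {reply, MinLevel, State};
handle_call(entries, _From, {_, Entries} = State) ->
    {reply, lists:reverse(Entries), State}.

%% Keep messages at or above the minimum level, drop the rest.
handle_cast({log, Level, Msg}, {MinLevel, Entries}) when Level >= MinLevel ->
    {noreply, {MinLevel, [{Level, Msg} | Entries]}};
handle_cast({log, _Level, _Msg}, State) ->
    {noreply, State}.
```

      Each option has a cost: an infinite timeout means a stuck logger blocks every caller, while asynchronous logging removes back-pressure, so the logger's mailbox can grow if messages arrive faster than they are written.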

        Attachments

        1. improved-sync-logging-v2.patch
          5 kB
          Randall Leeds
        2. improved-sync-logging.patch
          2 kB
          Randall Leeds

            People

            • Assignee: Unassigned
            • Reporter: Randall Leeds (tilgovi)