Solr / SOLR-11413

SolrGraphiteReporter fails to report metrics due to non-thread safe code


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.6, 7.0
    • Fix Version/s: 7.2, 8.0
    • Component/s: metrics
    • Labels: None

    Description

      Symptom:
      Intermittent errors when writing metrics to Graphite. The errors indicate that sockets are being used after they have already been closed.

      Cause:
      SolrGraphiteReporter caches and shares dropwizard Graphite instances. The reporters that use them are not thread safe, because each one opens and closes its instance variable of type GraphiteSender; when that instance is shared, one reporter can close the socket while another is still writing to it. On modern bare-metal hardware this problem was observed consistently and resulted in the majority of metrics failing to be delivered to Graphite.
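      The interleaving behind the symptom can be replayed deterministically, without threads. The sketch below is only an illustration: FakeSender is a hypothetical stand-in for a GraphiteSender (it is not the dropwizard class), and replay plays two report cycles against one shared instance in an order that a race could plausibly produce.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a GraphiteSender; NOT the dropwizard class. */
final class FakeSender {
    private boolean open;
    final List<String> delivered = new ArrayList<>();

    void connect() { open = true; }

    void send(String metric) {
        if (!open) {
            throw new IllegalStateException("socket already closed");
        }
        delivered.add(metric);
    }

    void close() { open = false; }
}

final class SharedSenderRace {
    /**
     * Replays one unlucky interleaving of two report cycles that share a sender.
     * Returns the error message of the failed send, or null if nothing failed.
     */
    static String replay(FakeSender shared) {
        shared.connect();              // reporter A begins its cycle
        shared.connect();              // reporter B begins on the same instance
        shared.send("a.count 1");      // A writes its metric
        shared.close();                // A finishes and closes the shared sender
        try {
            shared.send("b.count 1");  // B writes after A has closed the socket
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("second reporter failed with: " + replay(new FakeSender()));
    }
}
```

      In this replay the first reporter's metric is delivered and the second reporter's send fails, which matches the shape of the reported symptom.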

      Proposed Fix:
      Stop caching Graphite (and PickledGraphite) instances. They are not designed to be cached and shared, so each reporter should use its own instance.
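      A minimal sketch of the per-reporter idea, under stated assumptions: OwnedSender and PerReporterSenders are hypothetical names invented for this illustration and do not come from the patch or from dropwizard. Each simulated reporter obtains its own sender from a factory, so no other thread can close it mid-cycle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

/** Hypothetical sender used only for this sketch; NOT the dropwizard Graphite class. */
final class OwnedSender {
    private boolean open;

    void connect() { open = true; }

    void send(String metric) {
        if (!open) {
            throw new IllegalStateException("socket already closed");
        }
    }

    void close() { open = false; }
}

final class PerReporterSenders {
    /** Runs concurrent reporters, each with its own sender; returns the total number of failed sends. */
    static int run(int reporters, int cycles, Supplier<OwnedSender> factory) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(reporters);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int r = 0; r < reporters; r++) {
                OwnedSender sender = factory.get(); // one instance per reporter, never shared
                results.add(pool.submit(() -> {
                    int failures = 0;
                    for (int c = 0; c < cycles; c++) {
                        sender.connect();
                        try {
                            sender.send("metric " + c);
                        } catch (IllegalStateException e) {
                            failures++;
                        } finally {
                            sender.close();
                        }
                    }
                    return failures;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("failed sends: " + run(4, 1000, OwnedSender::new));
    }
}
```

      Because every sender is confined to a single reporter task, its connect, send, and close calls cannot interleave with another reporter's, and no locking is needed.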

      Test:
      The patch file includes a test that forces the error.

      Alternative Fixes Considered:

      • Change the Solr metrics architecture entirely to use a single metrics registry - seems undesirable and impractical
      • Create a synchronized or otherwise thread-safe implementation of the dropwizard Graphite reporter - this should be fixed upstream in dropwizard, and is not obviously preferable to the current model


          People

            Assignee: Andrzej Bialecki (ab)
            Reporter: Erik Persson (erikpersson)
            Votes: 0
            Watchers: 4
