Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-15783

MDC Values can leak between request threads, trace_id currently does

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 9.0
    • None
    • None

    Description

      Currently, the only general "MDC Cleanup" code that happens from request to request is done by MDCLoggingContext.clear() and MDCLoggingContext.reset() ... but these methods (in addition to only being concerned with a static list of 5 specific MDC values) are currently designed and used in such a way that things like "trace_id" are only "reset" at the begining of HttpSolrCall.call() ... which means any request specific log messages that happen before that point (like authn errors) will incorrectly refer to the trace_id of whatever request the thread previously processed.

      As a trivial example of this, if you apply a patch like...

      diff --git a/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java b/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java
      index 5f481c48e92..97c864617f4 100644
      --- a/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java
      +++ b/solr/core/src/java/org/apache/solr/client/solrj/embedded/JettySolrRunner.java
      @@ -109,7 +109,7 @@ public class JettySolrRunner {
       
         private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
       
      -  private static final int THREAD_POOL_MAX_THREADS = 10000;
      +  private static final int THREAD_POOL_MAX_THREADS = 10; // nocommit
         // NOTE: needs to be larger than SolrHttpClient.threadPoolSweeperMaxIdleTime
         private static final int THREAD_POOL_MAX_IDLE_TIME_MS = 260000;
       
      diff --git a/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java b/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java
      index 28167270099..fa6bb9a4eb8 100644
      --- a/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java
      +++ b/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java
      @@ -484,6 +484,9 @@ public class SolrDispatchFilter extends BaseSolrFilter {
               response.sendError(429, errorMessage);
             }
       
      +      log.error("nocommit: fake log message before HttpSolrCall.call() as if auth error happened: MDC_claims={} ; real_traceId={}",
      +                org.slf4j.MDC.get(org.apache.solr.logging.MDCLoggingContext.TRACE_ID), span.context().toTraceId());
      +      
             AtomicReference<HttpServletRequest> wrappedRequest = new AtomicReference<>();
             if (!authenticateRequest(request, response, wrappedRequest)) { // the response and status code have already been sent
               return;
      

      ...and then run ./gradlew test --tests TestDistributedTracing -Ptests.verbose=true you can pick out any single jetty query thread and see something like...

        2> 4103 ERROR (qtp1310066330-70) [] o.a.s.s.SolrDispatchFilter nocommit: fake log message before HttpSolrCall.call() as if auth error happened: MDC_claims=null ; real_traceId=1
        2> 6146 ERROR (qtp1310066330-70) [n:127.0.0.1:34135_solr t:1] o.a.s.s.SolrDispatchFilter nocommit: fake log message before HttpSolrCall.call() as if auth error happened: MDC_claims=1 ; real_traceId=22
        2> 6661 ERROR (qtp1310066330-70) [n:127.0.0.1:34135_solr t:22] o.a.s.s.SolrDispatchFilter nocommit: fake log message before HttpSolrCall.call() as if auth error happened: MDC_claims=22 ; real_traceId=50
      

      (Note also that during authn, the nodeId isn't populated for a 'new' thread, it's only set if it was leaked from the last request processed)

      Attachments

        1. SOLR-15783.patch
          7 kB
          Chris M. Hostetter

        Issue Links

          Activity

            People

              hossman Chris M. Hostetter
              hossman Chris M. Hostetter
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: