Solr
  1. Solr
  2. SOLR-5691

Unsynchronized WeakHashMap in SolrDispatchFilter causing issues in SolrCloud

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.6.1
    • Fix Version/s: 4.7, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

      I have a large SolrCloud setup, 7 nodes, each hosting few 1000 cores (leaders/replicas of same shard exist on different nodes), which is maybe making it easier to notice the problem.

      Node can randomly get into a state where it "stops" responding to PeerSync /get requests from other nodes. When that happens, threaddump of that node shows multiple entries like this one (one entry for each "blocked" request from other node; they don't go away with time):

      "http-bio-8080-exec-1781" daemon prio=5 tid=0x440177200000 nid=0x25ae [ JVM locked by VM at safepoint, polling bits: safep ]
      java.lang.Thread.State: RUNNABLE
      at java.util.WeakHashMap.get(WeakHashMap.java:471)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

      WeakHashMap's internal state can easily get corrupted when used in unsynchronized way, in which case it is known to enter infinite loop in .get() call. It is very likely that this happens here too. The reason why other maybe don't see this issue could be related to huge number of cores I have in this system. The problem is usually created when some node is starting. Also, it doesn't happen with each start, it obviously depends on "correct" timing of events which lead to map's corruption.

      The fix may be as simple as changing:

      protected final Map<SolrConfig, SolrRequestParsers> parsers = new WeakHashMap<SolrConfig, SolrRequestParsers>();

      to:

      protected final Map<SolrConfig, SolrRequestParsers> parsers = Collections.synchronizedMap(
      new WeakHashMap<SolrConfig, SolrRequestParsers>());

      but there may be performance considerations around this since it is entrance into Solr.

        Activity

        Hide
        Mark Miller added a comment -

        Hmm...what if we put the parser on the solrconfig object as a volatile? Volatiles that don't change are very fast and we avoid the ugly weakhashmap altogether:

        Index: solr/core/src/java/org/apache/solr/core/SolrConfig.java
        ===================================================================
        --- solr/core/src/java/org/apache/solr/core/SolrConfig.java	(revision 1564426)
        +++ solr/core/src/java/org/apache/solr/core/SolrConfig.java	(working copy)
        @@ -18,6 +18,7 @@
         package org.apache.solr.core;
         
         import static org.apache.solr.core.SolrConfig.PluginOpts.*;
        +
         import org.apache.solr.common.SolrException;
         import org.apache.solr.common.SolrException.ErrorCode;
         import org.apache.solr.schema.IndexSchemaFactory;
        @@ -28,11 +29,11 @@
         import org.apache.solr.request.SolrRequestHandler;
         import org.apache.solr.response.QueryResponseWriter;
         import org.apache.solr.response.transform.TransformerFactory;
        -
         import org.apache.solr.search.CacheConfig;
         import org.apache.solr.search.FastLRUCache;
         import org.apache.solr.search.QParserPlugin;
         import org.apache.solr.search.ValueSourceParser;
        +import org.apache.solr.servlet.SolrRequestParsers;
         import org.apache.solr.update.SolrIndexConfig;
         import org.apache.solr.update.UpdateLog;
         import org.apache.solr.update.processor.UpdateRequestProcessorChain;
        @@ -40,10 +41,8 @@
         import org.apache.lucene.search.BooleanQuery;
         import org.apache.lucene.index.IndexDeletionPolicy;
         import org.apache.lucene.util.Version;
        -
         import org.slf4j.Logger;
         import org.slf4j.LoggerFactory;
        -
         import org.w3c.dom.Node;
         import org.w3c.dom.NodeList;
         import org.xml.sax.InputSource;
        @@ -89,6 +88,8 @@
           private boolean handleSelect;
         
           private boolean addHttpRequestToContext;
        +
        +  private volatile SolrRequestParsers solrRequestParsers;
           
           /** Creates a default instance from the solrconfig.xml. */
           public SolrConfig()
        @@ -324,6 +325,13 @@
             }
             return result;
           }
        +  
        +  public SolrRequestParsers getRequestParsers() {
        +    if (solrRequestParsers == null) {
        +      solrRequestParsers = new SolrRequestParsers(this);
        +    }
        +    return solrRequestParsers;
        +  }
         
           /* The set of materialized parameters: */
           public final int booleanQueryMaxClauseCount;
        Index: solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java
        ===================================================================
        --- solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java	(revision 1564426)
        +++ solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java	(working copy)
        @@ -103,7 +103,6 @@
         
           protected String pathPrefix = null; // strip this from the beginning of a path
           protected String abortErrorMessage = null;
        -  protected final Map<SolrConfig, SolrRequestParsers> parsers = new WeakHashMap<SolrConfig, SolrRequestParsers>();
           
           private static final Charset UTF8 = Charset.forName("UTF-8");
         
        @@ -348,12 +347,7 @@
                 if( core != null ) {
                   final SolrConfig config = core.getSolrConfig();
                   // get or create/cache the parser for the core
        -          SolrRequestParsers parser = null;
        -          parser = parsers.get(config);
        -          if( parser == null ) {
        -            parser = new SolrRequestParsers(config);
        -            parsers.put(config, parser );
        -          }
        +          SolrRequestParsers parser = config.getRequestParsers();
         
                   // Handle /schema/* paths via Restlet
                   if( path.startsWith("/schema") ) {
        
        Show
        Mark Miller added a comment - Hmm...what if we put the parser on the solrconfig object as a volatile? Volatiles that don't change are very fast and we avoid the ugly weakhashmap altogether: Index: solr/core/src/java/org/apache/solr/core/SolrConfig.java =================================================================== --- solr/core/src/java/org/apache/solr/core/SolrConfig.java (revision 1564426) +++ solr/core/src/java/org/apache/solr/core/SolrConfig.java (working copy) @@ -18,6 +18,7 @@ package org.apache.solr.core; import static org.apache.solr.core.SolrConfig.PluginOpts.*; + import org.apache.solr.common.SolrException; import org.apache.solr.common.SolrException.ErrorCode; import org.apache.solr.schema.IndexSchemaFactory; @@ -28,11 +29,11 @@ import org.apache.solr.request.SolrRequestHandler; import org.apache.solr.response.QueryResponseWriter; import org.apache.solr.response.transform.TransformerFactory; - import org.apache.solr.search.CacheConfig; import org.apache.solr.search.FastLRUCache; import org.apache.solr.search.QParserPlugin; import org.apache.solr.search.ValueSourceParser; +import org.apache.solr.servlet.SolrRequestParsers; import org.apache.solr.update.SolrIndexConfig; import org.apache.solr.update.UpdateLog; import org.apache.solr.update.processor.UpdateRequestProcessorChain; @@ -40,10 +41,8 @@ import org.apache.lucene.search.BooleanQuery; import org.apache.lucene.index.IndexDeletionPolicy; import org.apache.lucene.util.Version; - import org.slf4j.Logger; import org.slf4j.LoggerFactory; - import org.w3c.dom.Node; import org.w3c.dom.NodeList; import org.xml.sax.InputSource; @@ -89,6 +88,8 @@ private boolean handleSelect; private boolean addHttpRequestToContext; + + private volatile SolrRequestParsers solrRequestParsers; /** Creates a default instance from the solrconfig.xml. */ public SolrConfig() @@ -324,6 +325,13 @@ } return result; } + + public SolrRequestParsers getRequestParsers() { + if (solrRequestParsers == null) { + solrRequestParsers = new SolrRequestParsers(this); + } + return solrRequestParsers; + } /* The set of materialized parameters: */ public final int booleanQueryMaxClauseCount; Index: solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java =================================================================== --- solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java (revision 1564426) +++ solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java (working copy) @@ -103,7 +103,6 @@ protected String pathPrefix = null; // strip this from the beginning of a path protected String abortErrorMessage = null; - protected final Map<SolrConfig, SolrRequestParsers> parsers = new WeakHashMap<SolrConfig, SolrRequestParsers>(); private static final Charset UTF8 = Charset.forName("UTF-8"); @@ -348,12 +347,7 @@ if( core != null ) { final SolrConfig config = core.getSolrConfig(); // get or create/cache the parser for the core - SolrRequestParsers parser = null; - parser = parsers.get(config); - if( parser == null ) { - parser = new SolrRequestParsers(config); - parsers.put(config, parser ); - } + SolrRequestParsers parser = config.getRequestParsers(); // Handle /schema/* paths via Restlet if( path.startsWith("/schema") ) {
        Hide
        ASF subversion and git services added a comment -

        Commit 1565069 from Mark Miller in branch 'dev/trunk'
        [ https://svn.apache.org/r1565069 ]

        SOLR-5691: Sharing non thread safe WeakHashMap across thread can cause problems.

        Show
        ASF subversion and git services added a comment - Commit 1565069 from Mark Miller in branch 'dev/trunk' [ https://svn.apache.org/r1565069 ] SOLR-5691 : Sharing non thread safe WeakHashMap across thread can cause problems.
        Hide
        ASF subversion and git services added a comment -

        Commit 1565070 from Mark Miller in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1565070 ]

        SOLR-5691: Sharing non thread safe WeakHashMap across thread can cause problems.

        Show
        ASF subversion and git services added a comment - Commit 1565070 from Mark Miller in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1565070 ] SOLR-5691 : Sharing non thread safe WeakHashMap across thread can cause problems.
        Hide
        Mark Miller added a comment -

        Die WeakHashMap, die!

        Thanks Bojan!

        Show
        Mark Miller added a comment - Die WeakHashMap, die! Thanks Bojan!
        Hide
        Bojan Smid added a comment -

        Thanks for fixing!

        Show
        Bojan Smid added a comment - Thanks for fixing!

          People

          • Assignee:
            Mark Miller
            Reporter:
            Bojan Smid
          • Votes:
            1 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development