Solr / SOLR-13678

ZkStateReader.removeCollectionPropsWatcher can deadlock with concurrent zkCallback thread on props watcher


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      While investigating an (unrelated) test bug in CollectionPropsTest, I discovered a deadlock that can occur when calling ZkStateReader.removeCollectionPropsWatcher() while a zkCallback thread concurrently tries to fire the watchers set on the collection props.

      ZkStateReader.removeCollectionPropsWatcher() is itself called when a CollectionPropsWatcher.onStateChanged() impl returns "true" – meaning that IIUC any usage of CollectionPropsWatcher could potentially result in this type of deadlock situation.

      "TEST-CollectionPropsTest.testReadWriteCached-seed#[D3C6921874D1CFEB]" #15 prio=5 os_prio=0 cpu=567.78ms elapsed=682.12s tid=0x00007fa5e8343800 nid=0x3f61 waiting for monitor entry  [0x00007fa62d222000]
         java.lang.Thread.State: BLOCKED (on object monitor)
              at org.apache.solr.common.cloud.ZkStateReader.lambda$removeCollectionPropsWatcher$20(ZkStateReader.java:2001)
              - waiting to lock <0x00000000e6207500> (a java.util.concurrent.ConcurrentHashMap)
              at org.apache.solr.common.cloud.ZkStateReader$$Lambda$617/0x00000001006c1840.apply(Unknown Source)
              at java.util.concurrent.ConcurrentHashMap.compute(java.base@11.0.3/ConcurrentHashMap.java:1932)
              - locked <0x00000000eb9156b8> (a java.util.concurrent.ConcurrentHashMap$Node)
              at org.apache.solr.common.cloud.ZkStateReader.removeCollectionPropsWatcher(ZkStateReader.java:1994)
              at org.apache.solr.cloud.CollectionPropsTest.testReadWriteCached(CollectionPropsTest.java:125)
      
      ...
      
      "zkCallback-88-thread-2" #213 prio=5 os_prio=0 cpu=14.06ms elapsed=672.65s tid=0x00007fa6041bf000 nid=0x402f waiting for monitor entry  [0x00007fa5b8f39000]
         java.lang.Thread.State: BLOCKED (on object monitor)
              at java.util.concurrent.ConcurrentHashMap.compute(java.base@11.0.3/ConcurrentHashMap.java:1923)
              - waiting to lock <0x00000000eb9156b8> (a java.util.concurrent.ConcurrentHashMap$Node)
              at org.apache.solr.common.cloud.ZkStateReader$PropsNotification.<init>(ZkStateReader.java:2262)
              at org.apache.solr.common.cloud.ZkStateReader.notifyPropsWatchers(ZkStateReader.java:2243)
              at org.apache.solr.common.cloud.ZkStateReader$PropsWatcher.refreshAndWatch(ZkStateReader.java:1458)
              - locked <0x00000000e6207500> (a java.util.concurrent.ConcurrentHashMap)
              at org.apache.solr.common.cloud.ZkStateReader$PropsWatcher.process(ZkStateReader.java:1440)
              at org.apache.solr.common.cloud.SolrZkClient$ProcessWatchWithExecutor.lambda$process$1(SolrZkClient.java:838)
              at org.apache.solr.common.cloud.SolrZkClient$ProcessWatchWithExecutor$$Lambda$253/0x00000001004a4440.run(Unknown Source)
              at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11.0.3/Executors.java:515)
              at java.util.concurrent.FutureTask.run(java.base@11.0.3/FutureTask.java:264)
              at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
              at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$140/0x0000000100308c40.run(Unknown Source)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.3/ThreadPoolExecutor.java:1128)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.3/ThreadPoolExecutor.java:628)
              at java.lang.Thread.run(java.base@11.0.3/Thread.java:834)
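
      The two traces above show a lock-order inversion: the test thread holds the ConcurrentHashMap$Node lock taken by compute() and waits for the monitor on the second map, while the zkCallback thread holds that monitor and waits for the same Node lock inside compute(). The following is only a minimal standalone sketch of that pattern, not the ZkStateReader code. The class name, the "observers" map, the "propsMonitor" object, and the latches that force the interleaving are all hypothetical stand-ins. It intentionally leaves two daemon threads blocked forever.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class PropsWatcherDeadlockSketch {

    /** Returns true if the JVM reports both sketch threads as monitor-deadlocked before the timeout. */
    public static boolean reproduce(long timeoutMillis) {
        // Hypothetical stand-in for the map whose Node is locked by compute().
        ConcurrentHashMap<String, Integer> observers = new ConcurrentHashMap<>();
        observers.put("collection1", 1);
        // Hypothetical stand-in for the object locked via synchronized in the callback path.
        Object propsMonitor = new Object();

        // Latches only force the unlucky interleaving; real code hits it by timing.
        CountDownLatch nodeLocked = new CountDownLatch(1);
        CountDownLatch monitorLocked = new CountDownLatch(1);

        // Mirrors the test thread: Node lock first (inside compute), then the monitor.
        Thread remover = new Thread(() ->
            observers.compute("collection1", (k, v) -> {
                nodeLocked.countDown();
                awaitQuietly(monitorLocked);
                synchronized (propsMonitor) {
                    return null;
                }
            }), "remover-sketch");

        // Mirrors the zkCallback thread: monitor first, then compute() on the same key.
        Thread callback = new Thread(() -> {
            synchronized (propsMonitor) {
                monitorLocked.countDown();
                awaitQuietly(nodeLocked);
                observers.compute("collection1", (k, v) -> v);
            }
        }, "zkCallback-sketch");

        remover.setDaemon(true);
        callback.setDaemon(true);
        remover.start();
        callback.start();

        // Poll the JVM's own deadlock detector instead of joining the threads.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null && contains(ids, remover.getId()) && contains(ids, callback.getId())) {
                return true;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static boolean contains(long[] ids, long id) {
        for (long candidate : ids) {
            if (candidate == id) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + reproduce(5000));
    }
}
```

      In this sketch, either thread taking the two locks in the same order as the other (or not taking the second lock from inside the compute() remapping function at all) would avoid the cycle. Whether either option is appropriate for ZkStateReader is not something this issue establishes.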
      
      

          People

            Assignee: Unassigned
            Reporter: Chris M. Hostetter (hossman)
