  1. ServiceMix
  2. SM-1858

Deadlock On Component Uninstall (using Seda Flow)

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.3
    • Fix Version/s: None
    • Component/s: servicemix-core
    • Labels: None
    • Environment: SunOS 5.10 Generic_137111-06 sun4u sparc SUNW,Sun-Fire-880
      Java(TM) SE Runtime Environment (build 1.6.0_01-b06)
      Java HotSpot(TM) 64-Bit Server VM (build 1.6.0_01-b06, mixed mode)
      ServiceMix 3.3

    Description

      I recently upgraded ServiceMix from 3.1 to 3.3 and have started seeing a deadlock when we attempt to reinstall a component on a running system. I'm not sure whether this bug was present before 3.3, but I never saw it in 3.1. The scenario involves two components, A and B: component A calls B and returns a response to its caller. The deadlock occurs if I try to uninstall component A (the first step in a reinstall) while component A is waiting for a response from component B.

      The timeline of lock operations that causes this deadlock is:

      1. Component A sends a synchronous request to component B through the SedaFlow. A read lock is established on the flow.
      2. Since the request is synchronous, the Component A request thread waits on the MessageExchange object (to be notified when a response is ready).
      3. A reinstall of component A is triggered. org.apache.servicemix.jbi.framework.InstallationService.unloadInstaller is called to first remove this component. The first thing the InstallationService does is suspend the broker, which in turn suspends the SedaFlow. Before the SedaFlow can be suspended, a write lock must be acquired on the flow; however, this write lock cannot be acquired until the read lock from step 1 is released.
      4. Component B finishes its request and is ready to return the response. Before it calls notify on the MessageExchange object from step 2 (allowing Component A to finish its request), it must first acquire a read lock on the SedaFlow. However, it can't acquire this read lock because of the waiting write lock from step 3.
      5. Deadlock
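
      The timeline above can be replayed outside ServiceMix. The following is a minimal sketch, assuming the flow's lock behaves like a default java.util.concurrent.locks.ReentrantReadWriteLock; the class and method names are hypothetical and not part of ServiceMix. Timeouts replace the indefinite waits so that the demo terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not ServiceMix code.
class FlowDeadlockSketch {

    /**
     * Replays steps 1-4 with timeouts. Returns true if "Component B" could not
     * get the read lock while the suspending thread was queued for the write lock.
     */
    static boolean readerBlockedBehindWriter() {
        final ReentrantReadWriteLock flowLock = new ReentrantReadWriteLock();
        final boolean[] bBlocked = new boolean[1];

        // Step 3: the uninstall thread wants the write lock (timed, so it cannot hang).
        Thread suspender = new Thread(() -> {
            try {
                if (flowLock.writeLock().tryLock(5, TimeUnit.SECONDS)) {
                    flowLock.writeLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Step 4: "Component B" needs the read lock before it can deliver the response.
        Thread componentB = new Thread(() -> {
            try {
                if (flowLock.readLock().tryLock(500, TimeUnit.MILLISECONDS)) {
                    flowLock.readLock().unlock();
                } else {
                    bBlocked[0] = true; // stuck behind the queued writer
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Steps 1-2: "Component A" (this thread) takes the read lock and keeps
        // holding it while it waits for the response.
        flowLock.readLock().lock();
        try {
            suspender.start();
            while (suspender.isAlive() && !flowLock.hasQueuedThread(suspender)) {
                Thread.sleep(10); // wait until the writer is actually queued
            }
            componentB.start();
            componentB.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // In the real deadlock A never gets here, because B never responds.
            flowLock.readLock().unlock();
        }
        try {
            suspender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return bBlocked[0];
    }

    public static void main(String[] args) {
        System.out.println("Component B blocked behind queued writer: "
                + readerBlockedBehindWriter());
    }
}
```

      Without the timeouts, all three threads would wait on each other indefinitely: A on B's response, the suspending thread on A's read lock, and B on the queued writer.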

      I'm not sure what the best fix is, but since I don't understand ServiceMix's internals well enough to tinker with the synchronization, I'm going to change the write lock acquisition in the suspend() method of AbstractFlow to time out after a few seconds. Something like:

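          public synchronized void suspend() {
              if (log.isDebugEnabled()) {
                  log.debug("Called Flow suspend");
              }
              try
              {
                // tryLock returns false on timeout; the result must be checked,
                // otherwise the flow is marked suspended without holding the lock.
                if (!lock.writeLock().tryLock(10, TimeUnit.SECONDS))
                {
                  throw new RuntimeException("Unable to suspend flow because write lock could not be acquired.");
                }
              }
              catch (InterruptedException iexc)
              {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                throw new RuntimeException("Interrupted while waiting for the flow write lock.", iexc);
              }
              suspendThread = Thread.currentThread();
          }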
      
      

      I think this will work in my scenario because I'm only using one flow (SEDA). If multiple flows were in use, however, some could be suspended before this exception is thrown, which could leave everything in a bad state (i.e. this is definitely a hack).
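
      One way to avoid the partially suspended state with multiple flows would be to release any write locks already taken when a later one times out. The following is only a sketch of that idea on plain ReadWriteLock objects; the class and method names are hypothetical, it is not how ServiceMix suspends flows, and it assumes the locks are released by the same thread that acquired them.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;

// Hypothetical helper, not ServiceMix code.
class AllOrNothingSuspend {

    /**
     * Tries to take the write lock of every flow lock in order.
     * Returns true only if all were acquired; otherwise the locks
     * already taken are released again and false is returned.
     */
    static boolean suspendAll(List<? extends ReadWriteLock> flowLocks,
                              long timeout, TimeUnit unit) {
        Deque<ReadWriteLock> acquired = new ArrayDeque<>();
        try {
            for (ReadWriteLock flowLock : flowLocks) {
                if (!flowLock.writeLock().tryLock(timeout, unit)) {
                    releaseAll(acquired); // roll back the partial suspend
                    return false;
                }
                acquired.push(flowLock);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            releaseAll(acquired);
            return false;
        }
    }

    /** Releases the write locks in reverse order of acquisition. */
    private static void releaseAll(Deque<ReadWriteLock> acquired) {
        while (!acquired.isEmpty()) {
            acquired.pop().writeLock().unlock();
        }
    }
}
```

      This only prevents the inconsistent state; it does not remove the underlying deadlock, so the uninstall would still fail while component A is waiting on B.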

      I've also attached the stack traces from the time line above.

      Attachments

        1. stacktraces.txt
          5 kB
          Corey Baswell

          People

            Assignee: Freeman Yue Fang (ffang)
            Reporter: Corey Baswell (cbaswell)

            Dates

              Created:
              Updated:
              Resolved: