Log4j 2
  1. Log4j 2
  2. LOG4J2-485

toString methods that perform logging can deadlock AsyncAppender

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0-beta9
    • Fix Version/s: 2.0-rc1, 2.0
    • Component/s: Appenders
    • Labels:
      None

      Description

      During analysis of LOG4J2-471, I found the same issue can happen with AsyncAppender. Summary of the problem:

      Application thread A calls Logger.log(), which puts LogEvents in a queue. Appender thread B takes LogEvents off the queue, converts them to a byte[] array and saves this byte array to disk. The conversion from a LogEvent to a byte[] array may itself result in calls to Logger.log() (particularly if logging is done from the toString() method of an object that is being logged).
      Here is where the problem occurs: if the Appender thread calls Logger.log() when processing LogEvents from the queue, and it happens that the queue is full, then in blocking mode, the Appender thread will block until space is available in the queue. Unfortunately space will only become available if the Appender consumes LogEvents, but the Appender thread is blocked... resulting in deadlock.

      It is possible to configure AsyncAppender with blocking="false", which will log to an errorAppender when the queue is full, but a better solution would be to prevent deadlock in the above scenario.

      The problem can be reproduced with this program:

      package demo.deadlock;
      import org.apache.logging.log4j.LogManager;
      import org.apache.logging.log4j.Logger;
      
      public class DeadlockDemo {
          static Logger sLog = LogManager.getLogger(DeadlockDemo.class);
      
          public static void main(String[] args) throws Exception {
              try {
                  System.out.println("(Sysout) Before throwing ex...");
                  sLog.info("(Log) Before throwing ex...");
                  Thread.sleep(1000);
      
                  throw new LoggingEx();
              } catch (Exception ex) {
                  sLog.warn("Caught a {} problem", ex, ex);
                  sLog.error(ex, ex);
              }
              System.out.println("(Sysout) After catching ex");
              sLog.info("(Log) After catching ex");
              
              Thread.sleep(1000);
              System.out.println("(Sysout) After 1 sec..."); // I still see this
              sLog.info("(Log) After 1 sec..."); // causes deadlock on my laptop
              
              Thread.sleep(1000);
              System.out.println("(Sysout) After 2 sec..."); // I don't see this output
              sLog.info("(Log) After 2 sec...");
              sLog.info("Done. Completed without deadlock.");
          }
          
          public static class LoggingEx extends Exception {
              static final Logger exLog = LogManager.getLogger(LoggingEx.class);
              
              @Override
              public String toString() {
                  System.out.println("(Sysout) Before LoggingEx calling logger.debug 300 times...");
      
                  // fill up AsyncAppender queue (128 slots)
                  for(int i = 0; i < 300; i++) {
                      exLog.debug("(Log) in LoggingEx.toString() {}...", i);
                  }
                  System.out.println("(Sysout) After LoggingEx called logger.debug 300 times...");
                  try {
                      Thread.sleep(1000);
                  } catch (InterruptedException e) {
                      e.printStackTrace();
                  }
                  return "oops";
              }
          }
      }
      

      ... and this log4j2.xml config:

      <?xml version="1.0" encoding="UTF-8"?>
      <configuration status="TRACE" name="deadlock-demo" packages="">
        <appenders>
            <Console name="STDOUT" target="SYSTEM_OUT">
              <PatternLayout pattern="%m%n"/>
            </Console>
          <Async name="Async" bufferSize="128" errorRef="STDOUT">
            <appenderRef ref="STDOUT" />
          </Async>
        </appenders>
        <loggers>
          <root level="DEBUG">
            <appenderRef ref="Async" />
          </root>
        </loggers>
      </configuration>
      

      Output:

      (Sysout) Before throwing ex...
      (Log) Before throwing ex...
      (Sysout) Before LoggingEx calling logger.debug 300 times...
      (Log) in LoggingEx.toString() 0...
      (Log) in LoggingEx.toString() 1...
      (Log) in LoggingEx.toString() 2...
      ...(omitted)
      (Log) in LoggingEx.toString() 169...
      (Log) in LoggingEx.toString() 170...
      (Sysout) After LoggingEx called logger.debug 300 times...
      (Log) in LoggingEx.toString() 171...
      (Log) in LoggingEx.toString() 172...
      ...(omitted)
      (Log) in LoggingEx.toString() 298...
      (Log) in LoggingEx.toString() 299...
      Caught a oops problem
      demo.deadlock.DeadlockDemo$LoggingEx
      	at demo.deadlock.DeadlockDemo.main(DeadlockDemo.java:32) [bin/:?]
      (Sysout) After catching ex
      (Sysout) Before LoggingEx calling logger.debug 300 times...
      (Sysout) After 1 sec...
      

      ... and the application hangs.

        Issue Links

          Activity

          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open Resolved Resolved
          2h 8m 1 Remko Popma 05/Jan/14 08:16
          Resolved Resolved Closed Closed
          2m 39s 1 Remko Popma 05/Jan/14 08:19
          Remko Popma made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Remko Popma made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hide
          Remko Popma added a comment -

          I applied the same idea as in LOG4J2-471 here:
          Use a threadlocal variable to detect whether we're about to add a LogEvent to the queue from the Appender thread. If that is the case and the queue is full, then bypass the queue and do the Appender thread work directly (callAppenders).

          As with LOG4J2-471, I have been unable to create JUnit tests for the deadlock scenario, and instead I used the DeadLockDemo class in the description above to verify the solution. Without the patch the demo class reliably deadlocks, with the patch there is no deadlock.

          Fixed in revision 1555468.

          Show
          Remko Popma added a comment - I applied the same idea as in LOG4J2-471 here: Use a threadlocal variable to detect whether we're about to add a LogEvent to the queue from the Appender thread. If that is the case and the queue is full, then bypass the queue and do the Appender thread work directly (callAppenders). As with LOG4J2-471 , I have been unable to create JUnit tests for the deadlock scenario, and instead I used the DeadLockDemo class in the description above to verify the solution. Without the patch the demo class reliably deadlocks, with the patch there is no deadlock. Fixed in revision 1555468.
          Remko Popma made changes -
          Description Similar to LOG4J2-471, but for AsyncAppender.
          See DeadlockDemo class in LOG4J2-471 comments.
          During analysis of LOG4J2-471, I found the same issue can happen with AsyncAppender. Summary of the problem:

          Application thread A calls {{Logger.log()}}, which puts LogEvents in a queue. Appender thread B takes LogEvents off the queue, converts them to a byte[] array and saves this byte array to disk. The conversion from a LogEvent to a byte[] array may itself result in calls to {{Logger.log()}} (particularly if logging is done from the {{toString()}} method of an object that is being logged).
          Here is where the problem occurs: if the Appender thread calls {{Logger.log()}} when processing LogEvents from the queue, and it happens that the queue is full, then in blocking mode, the Appender thread will block until space is available in the queue. Unfortunately space will only become available if the Appender consumes LogEvents, but the Appender thread is blocked... resulting in deadlock.

          It is possible to configure {{AsyncAppender}} with {{blocking="false"}}, which will log to an errorAppender when the queue is full, but a better solution would be to prevent deadlock in the above scenario.

          The problem can be reproduced with this program:
          {code}
          package demo.deadlock;
          import org.apache.logging.log4j.LogManager;
          import org.apache.logging.log4j.Logger;

          public class DeadlockDemo {
              static Logger sLog = LogManager.getLogger(DeadlockDemo.class);

              public static void main(String[] args) throws Exception {
                  try {
                      System.out.println("(Sysout) Before throwing ex...");
                      sLog.info("(Log) Before throwing ex...");
                      Thread.sleep(1000);

                      throw new LoggingEx();
                  } catch (Exception ex) {
                      sLog.warn("Caught a {} problem", ex, ex);
                      sLog.error(ex, ex);
                  }
                  System.out.println("(Sysout) After catching ex");
                  sLog.info("(Log) After catching ex");
                  
                  Thread.sleep(1000);
                  System.out.println("(Sysout) After 1 sec..."); // I still see this
                  sLog.info("(Log) After 1 sec..."); // causes deadlock on my laptop
                  
                  Thread.sleep(1000);
                  System.out.println("(Sysout) After 2 sec..."); // I don't see this output
                  sLog.info("(Log) After 2 sec...");
                  sLog.info("Done. Completed without deadlock.");
              }
              
              public static class LoggingEx extends Exception {
                  static final Logger exLog = LogManager.getLogger(LoggingEx.class);
                  
                  @Override
                  public String toString() {
                      System.out.println("(Sysout) Before LoggingEx calling logger.debug 300 times...");

                      // fill up AsyncAppender queue (128 slots)
                      for(int i = 0; i < 300; i++) {
                          exLog.debug("(Log) in LoggingEx.toString() {}...", i);
                      }
                      System.out.println("(Sysout) After LoggingEx called logger.debug 300 times...");
                      try {
                          Thread.sleep(1000);
                      } catch (InterruptedException e) {
                          e.printStackTrace();
                      }
                      return "oops";
                  }
              }
          }
          {code}
          ... and this log4j2.xml config:
          {code}
          <?xml version="1.0" encoding="UTF-8"?>
          <configuration status="TRACE" name="deadlock-demo" packages="">
            <appenders>
                <Console name="STDOUT" target="SYSTEM_OUT">
                  <PatternLayout pattern="%m%n"/>
                </Console>
              <Async name="Async" bufferSize="128" errorRef="STDOUT">
                <appenderRef ref="STDOUT" />
              </Async>
            </appenders>
            <loggers>
              <root level="DEBUG">
                <appenderRef ref="Async" />
              </root>
            </loggers>
          </configuration>
          {code}
          Output:
          {code}
          (Sysout) Before throwing ex...
          (Log) Before throwing ex...
          (Sysout) Before LoggingEx calling logger.debug 300 times...
          (Log) in LoggingEx.toString() 0...
          (Log) in LoggingEx.toString() 1...
          (Log) in LoggingEx.toString() 2...
          ...(omitted)
          (Log) in LoggingEx.toString() 169...
          (Log) in LoggingEx.toString() 170...
          (Sysout) After LoggingEx called logger.debug 300 times...
          (Log) in LoggingEx.toString() 171...
          (Log) in LoggingEx.toString() 172...
          ...(omitted)
          (Log) in LoggingEx.toString() 298...
          (Log) in LoggingEx.toString() 299...
          Caught a oops problem
          demo.deadlock.DeadlockDemo$LoggingEx
          at demo.deadlock.DeadlockDemo.main(DeadlockDemo.java:32) [bin/:?]
          (Sysout) After catching ex
          (Sysout) Before LoggingEx calling logger.debug 300 times...
          (Sysout) After 1 sec...
          {code}
          ... and the application hangs.
          Remko Popma made changes -
          Field Original Value New Value
          Link This issue relates to LOG4J2-471 [ LOG4J2-471 ]
          Remko Popma created issue -

            People

            • Assignee:
              Remko Popma
              Reporter:
              Remko Popma
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development