Log4j 2
  1. Log4j 2
  2. LOG4J2-471

toString methods that perform logging can deadlock AsyncLogger

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0-beta9
    • Fix Version/s: 2.0-rc1, 2.0
    • Component/s: Core
    • Labels:
      None
    • Environment:

      Debian Squeeze
      Java 7u25

      Description

      The AsyncLogger thread is the only consumer of a Disruptor RingBuffer which contains log events ostensibly produced by a number of other threads. However, when AsyncLogger generates a log message, it has to call toString() on objects which can potentially also perform logging internally. This results in a situation when the RingBuffer is nearly full whereby AsyncLogger has pulled an event from the RingBuffer, calls toString() on the associated objects, which call AsyncLogger.log which attempts to publish an event into a full buffer, which will park repeatedly without hope of ever leaving the loop. Relevant stack trace is:
      "AsyncLogger-1" daemon prio=10 tid=0x00007f7a705a8000 nid=0x3d73 runnable [0x00007f7a740d1000]
      java.lang.Thread.State: TIMED_WAITING (parking)
      at sun.misc.Unsafe.park(Native Method)
      at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:349)
      at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:133)
      at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:102)
      at com.lmax.disruptor.RingBuffer.publishEvent(RingBuffer.java:391)
      at com.lmax.disruptor.dsl.Disruptor.publishEvent(Disruptor.java:219)
      at org.apache.logging.log4j.core.async.AsyncLogger.log(AsyncLogger.java:214)
      at org.apache.logging.log4j.spi.AbstractLoggerWrapper.log(AbstractLoggerWrapper.java:55)
      at org.slf4j.impl.SLF4JLogger.debug(SLF4JLogger.java:138)
      at org.somepackage.SomeException.toString (SomeException.java:32)
      at java.lang.String.valueOf(String.java:2854)
      at java.io.PrintWriter.println(PrintWriter.java:754)
      at java.lang.Throwable$WrappedPrintWriter.println(Throwable.java:763)
      at java.lang.Throwable.printStackTrace(Throwable.java:654)

      • locked <0x00000000cb2f6380> (a java.io.PrintWriter)
        at java.lang.Throwable.printStackTrace(Throwable.java:720)
        at org.apache.logging.log4j.core.pattern.ThrowablePatternConverter.formatOption(ThrowablePatternConverter.java:138)
        at org.apache.logging.log4j.core.pattern.ThrowablePatternConverter.format(ThrowablePatternConverter.java:80)
        at org.apache.logging.log4j.core.pattern.ExtendedThrowablePatternConverter.format(ExtendedThrowablePatternConverter.java:69)
        at org.apache.logging.log4j.core.pattern.PatternFormatter.format(PatternFormatter.java:36)
        at org.apache.logging.log4j.core.layout.PatternLayout.toSerializable(PatternLayout.java:167)
        at org.apache.logging.log4j.core.layout.PatternLayout.toSerializable(PatternLayout.java:52)
        at org.apache.logging.log4j.core.layout.AbstractStringLayout.toByteArray(AbstractStringLayout.java:45)
        at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:111)
        at org.apache.logging.log4j.core.appender.RollingRandomAccessFileAppender.append(RollingRandomAccessFileAppender.java:96)
        at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:99)
        at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:425)
        at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:406)
        at org.apache.logging.log4j.core.Logger$PrivateConfig.logEvent(Logger.java:273)
        at org.apache.logging.log4j.core.async.AsyncLogger.actualAsyncLog(AsyncLogger.java:230)
        at org.apache.logging.log4j.core.async.RingBufferLogEvent.execute(RingBufferLogEvent.java:98)
        at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:43)
        at org.apache.logging.log4j.core.async.RingBufferLogEventHandler.onEvent(RingBufferLogEventHandler.java:28)
        at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:133)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
      1. LOG4J2-471.patch
        4 kB
        Anthony Baldocchi

        Issue Links

          Activity

          Anthony Baldocchi created issue -
          Remko Popma made changes -
          Field Original Value New Value
          Assignee Remko Popma [ remkop@yahoo.com ]
          Hide
          Remko Popma added a comment -

          Anthony, from the stacktrace, I agree with your analysis. I'm still thinking about how to solve this.
          Did you use the default ring buffer size?

          Btw, it looks like this issue can also happen with AsyncAppender. (See demo code below).
          The saving grace with AsyncAppender is that it can be configured to blocking="false", which will log to an errorAppender when the queue is full. AsyncLoggers currently don't have any such mechanism.

          package demo.deadlock;
          import org.apache.logging.log4j.LogManager;
          import org.apache.logging.log4j.Logger;
          
          public class DeadlockDemo {
              static Logger sLog = LogManager.getLogger(DeadlockDemo.class);
          
              public static void main(String[] args) throws Exception {
                  try {
                      System.out.println("(Sysout) Before throwing ex...");
                      sLog.info("(Log) Before throwing ex...");
                      Thread.sleep(1000);
          
                      throw new LoggingEx();
                  } catch (Exception ex) {
                      sLog.warn("Caught a {} problem", ex, ex);
                      sLog.error(ex, ex);
                  }
                  System.out.println("(Sysout) After catching ex");
                  sLog.info("(Log) After catching ex");
                  
                  Thread.sleep(1000);
                  System.out.println("(Sysout) After 1 sec..."); // I still see this
                  sLog.info("(Log) After 1 sec..."); // causes deadlock on my laptop
                  
                  Thread.sleep(1000);
                  System.out.println("(Sysout) After 2 sec..."); // I don't see this output
                  sLog.info("(Log) After 2 sec...");
              }
              
              public static class LoggingEx extends Exception {
                  static final Logger exLog = LogManager.getLogger(LoggingEx.class);
                  
                  @Override
                  public String toString() {
                      System.out.println("(Sysout) Before LoggingEx calling logger.debug 300 times...");
          
                      // fill up AsyncAppender queue (128 slots)
                      for(int i = 0; i < 300; i++) {
                          exLog.debug("(Log) in LoggingEx.toString() {}...", i);
                      }
                      System.out.println("(Sysout) After LoggingEx called logger.debug 300 times...");
                      try {
                          Thread.sleep(1000);
                      } catch (InterruptedException e) {
                          e.printStackTrace();
                      }
                      return "oops";
                  }
              }
          }
          

          log4j2.xml config:

          <?xml version="1.0" encoding="UTF-8"?>
          <configuration status="TRACE" name="deadlock-demo" packages="">
            <appenders>
                <Console name="STDOUT" target="SYSTEM_OUT">
                  <PatternLayout pattern="%m%n"/>
                </Console>
              <Async name="Async" bufferSize="128" errorRef="STDOUT">
                <appenderRef ref="STDOUT" />
              </Async>
            </appenders>
            <loggers>
              <root level="DEBUG">
                <appenderRef ref="Async" />
              </root>
            </loggers>
          </configuration>
          

          Output:

          (Sysout) Before throwing ex...
          (Log) Before throwing ex...
          (Sysout) Before LoggingEx calling logger.debug 300 times...
          (Log) in LoggingEx.toString() 0...
          (Log) in LoggingEx.toString() 1...
          (Log) in LoggingEx.toString() 2...
          ...(omitted)
          (Log) in LoggingEx.toString() 169...
          (Log) in LoggingEx.toString() 170...
          (Sysout) After LoggingEx called logger.debug 300 times...
          (Log) in LoggingEx.toString() 171...
          (Log) in LoggingEx.toString() 172...
          ...(omitted)
          (Log) in LoggingEx.toString() 298...
          (Log) in LoggingEx.toString() 299...
          Caught a oops problem
          demo.deadlock.DeadlockDemo$LoggingEx
          	at demo.deadlock.DeadlockDemo.main(DeadlockDemo.java:32) [bin/:?]
          (Sysout) After catching ex
          (Sysout) Before LoggingEx calling logger.debug 300 times...
          (Sysout) After 1 sec...
          

          ... and the application hangs.

          Show
          Remko Popma added a comment - Anthony, from the stacktrace, I agree with your analysis. I'm still thinking about how to solve this. Did you use the default ring buffer size? Btw, it looks like this issue can also happen with AsyncAppender . (See demo code below). The saving grace with AsyncAppender is that it can be configured to blocking="false" , which will log to an errorAppender when the queue is full. AsyncLoggers currently don't have any such mechanism. package demo.deadlock; import org.apache.logging.log4j.LogManager; import org.apache.logging.log4j.Logger; public class DeadlockDemo { static Logger sLog = LogManager.getLogger(DeadlockDemo.class); public static void main( String [] args) throws Exception { try { System .out.println( "(Sysout) Before throwing ex..." ); sLog.info( "(Log) Before throwing ex..." ); Thread .sleep(1000); throw new LoggingEx(); } catch (Exception ex) { sLog.warn( "Caught a {} problem" , ex, ex); sLog.error(ex, ex); } System .out.println( "(Sysout) After catching ex" ); sLog.info( "(Log) After catching ex" ); Thread .sleep(1000); System .out.println( "(Sysout) After 1 sec..." ); // I still see this sLog.info( "(Log) After 1 sec..." ); // causes deadlock on my laptop Thread .sleep(1000); System .out.println( "(Sysout) After 2 sec..." ); // I don't see this output sLog.info( "(Log) After 2 sec..." ); } public static class LoggingEx extends Exception { static final Logger exLog = LogManager.getLogger(LoggingEx.class); @Override public String toString() { System .out.println( "(Sysout) Before LoggingEx calling logger.debug 300 times..." ); // fill up AsyncAppender queue (128 slots) for ( int i = 0; i < 300; i++) { exLog.debug( "(Log) in LoggingEx.toString() {}..." , i); } System .out.println( "(Sysout) After LoggingEx called logger.debug 300 times..." ); try { Thread .sleep(1000); } catch (InterruptedException e) { e.printStackTrace(); } return "oops" ; } } } log4j2.xml config: <?xml version= "1.0" encoding= "UTF-8" ?> <configuration status= "TRACE" name= "deadlock-demo" packages=""> <appenders> <Console name= "STDOUT" target= "SYSTEM_OUT" > <PatternLayout pattern= "%m%n" /> </Console> <Async name= "Async" bufferSize= "128" errorRef= "STDOUT" > <appenderRef ref= "STDOUT" /> </Async> </appenders> <loggers> <root level= "DEBUG" > <appenderRef ref= "Async" /> </root> </loggers> </configuration> Output: (Sysout) Before throwing ex... (Log) Before throwing ex... (Sysout) Before LoggingEx calling logger.debug 300 times... (Log) in LoggingEx.toString() 0... (Log) in LoggingEx.toString() 1... (Log) in LoggingEx.toString() 2... ...(omitted) (Log) in LoggingEx.toString() 169... (Log) in LoggingEx.toString() 170... (Sysout) After LoggingEx called logger.debug 300 times... (Log) in LoggingEx.toString() 171... (Log) in LoggingEx.toString() 172... ...(omitted) (Log) in LoggingEx.toString() 298... (Log) in LoggingEx.toString() 299... Caught a oops problem demo.deadlock.DeadlockDemo$LoggingEx at demo.deadlock.DeadlockDemo.main(DeadlockDemo.java:32) [bin/:?] (Sysout) After catching ex (Sysout) Before LoggingEx calling logger.debug 300 times... (Sysout) After 1 sec... ... and the application hangs.
          Hide
          Anthony Baldocchi added a comment -

          Remko, in my current use case, the RingBuffer is configured to a smaller number of entries than the default (variable based on heap size and assuming a near-worst-case referenced object because asynchronous logging can potentially alter the expected lifecycle of objects if the logging thread is running behind; I was hitting OutOfMemoryErrors due to some large objects that couldn't be garbage collected until logging finished but new large objects could be generated because the thread that issued the log request was not blocked; 262144 large objects overflowed the heap).

          However, I think RingBuffer size only impacts the likelihood of encountering the issue (as the likelihood of hitting a full buffer decreases); as long as the buffer is filling more quickly than it is being emptied (not difficult to do with asynchronous logging), the buffer can reach fullness and, once full, this issue becomes significant. With a high logging rate, a single rollover event might be all that is needed to reach buffer fullness, no matter how large.

          The workaround I'm currently considering is a filter, akin to the BurstFilter, that makes use of the RingBufferAdmin introduced in 2.0-rc1 to reject logging events when the RingBuffer is full. I like the idea of the errorAppender you mention for AsyncAppender; I'm going to see if I can add similar fallback functionality as I'd like to avoid dropping logging events entirely if at all possible.

          An alternative solution I was considering was making use of a ThreadLocal to identify the AsyncLogger thread and then since I could determine if the thread was the AsyncLogger thread and that the RingBuffer was full, skip the publish event step and instead just call config.loggerconfig.log(...), passing the arguments on through. Patch for this second solution is attached.

          Show
          Anthony Baldocchi added a comment - Remko, in my current use case, the RingBuffer is configured to a smaller number of entries than the default (variable based on heap size and assuming a near-worst-case referenced object because asynchronous logging can potentially alter the expected lifecycle of objects if the logging thread is running behind; I was hitting OutOfMemoryErrors due to some large objects that couldn't be garbage collected until logging finished but new large objects could be generated because the thread that issued the log request was not blocked; 262144 large objects overflowed the heap). However, I think RingBuffer size only impacts the likelihood of encountering the issue (as the likelihood of hitting a full buffer decreases); as long as the buffer is filling more quickly than it is being emptied (not difficult to do with asynchronous logging), the buffer can reach fullness and, once full, this issue becomes significant. With a high logging rate, a single rollover event might be all that is needed to reach buffer fullness, no matter how large. The workaround I'm currently considering is a filter, akin to the BurstFilter, that makes use of the RingBufferAdmin introduced in 2.0-rc1 to reject logging events when the RingBuffer is full. I like the idea of the errorAppender you mention for AsyncAppender; I'm going to see if I can add similar fallback functionality as I'd like to avoid dropping logging events entirely if at all possible. An alternative solution I was considering was making use of a ThreadLocal to identify the AsyncLogger thread and then since I could determine if the thread was the AsyncLogger thread and that the RingBuffer was full, skip the publish event step and instead just call config.loggerconfig.log(...), passing the arguments on through. Patch for this second solution is attached.
          Hide
          Anthony Baldocchi added a comment -

          Attached patch that avoids blocking due to full RingBuffer on the AsyncLogger thread.

          Show
          Anthony Baldocchi added a comment - Attached patch that avoids blocking due to full RingBuffer on the AsyncLogger thread.
          Anthony Baldocchi made changes -
          Attachment LOG4J2-471.patch [ 12620074 ]
          Hide
          Remko Popma added a comment -

          Anthony,
          I've taken a quick look at the patch you provided. I understand what you are trying to do and it seems a good idea to me.
          Before your patch I was also thinking along lines of dropping events or using a fallback / errorAppender, but this solution seems much simpler. Nice!

          I want to play around with it a bit, and also see if this solution works for AsyncLoggerConfig (the use case where sync and async loggers are mixed, using AsyncLogger or AsyncRoot configuration elements).

          Show
          Remko Popma added a comment - Anthony, I've taken a quick look at the patch you provided. I understand what you are trying to do and it seems a good idea to me. Before your patch I was also thinking along lines of dropping events or using a fallback / errorAppender, but this solution seems much simpler. Nice! I want to play around with it a bit, and also see if this solution works for AsyncLoggerConfig (the use case where sync and async loggers are mixed, using AsyncLogger or AsyncRoot configuration elements).
          Remko Popma made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Remko Popma made changes -
          Fix Version/s 2.0 [ 12324757 ]
          Hide
          Remko Popma added a comment -

          Anthony, thanks again for the patch!
          I've applied it with some changes, and also applied a variant of the same idea to AsyncLoggerConfig (for mixed sync and async loggers).

          I have not been able to create JUnit tests for the deadlock scenario, but I've used the DeadLockDemo class in the comment above to verify the solution. Without the patch this class reliably deadlocks, with the patch there is no deadlock (both for AsyncLoggers and AsyncLoggerConfig).

          To test AsyncLoggers, run the above demo.deadlock.DeadlockDemo with system properties

          -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
          -DAsyncLogger.RingBufferSize=128
          

          and configuration

          <?xml version="1.0" encoding="UTF-8"?>
          <configuration status="TRACE" name="deadlock-demo" packages="">
            <appenders>
                <Console name="STDOUT" target="SYSTEM_OUT">
                  <PatternLayout pattern="%m%n"/>
                </Console>
            </appenders>
            <loggers>
              <root level="DEBUG">
                <appenderRef ref="STDOUT" />
              </root>
            </loggers>
          </configuration>
          

          To test AsyncLoggerConfig (mixed sync and async loggers), run the above demo.deadlock.DeadlockDemo with system properties

          -DAsyncLoggerConfig.RingBufferSize=128
          

          and configuration

          <?xml version="1.0" encoding="UTF-8"?>
          <configuration status="TRACE" name="deadlock-demo" packages="">
            <appenders>
                <Console name="STDOUT" target="SYSTEM_OUT">
                  <PatternLayout pattern="%m%n"/>
                </Console>
            </appenders>
            <loggers>
              <AsyncRoot level="DEBUG">
                <appenderRef ref="STDOUT" />
              </AsyncRoot>
            </loggers>
          </configuration>
          

          Fixed in revision 1555336.
          Please verify and close.

          Show
          Remko Popma added a comment - Anthony, thanks again for the patch! I've applied it with some changes, and also applied a variant of the same idea to AsyncLoggerConfig (for mixed sync and async loggers). I have not been able to create JUnit tests for the deadlock scenario, but I've used the DeadLockDemo class in the comment above to verify the solution. Without the patch this class reliably deadlocks, with the patch there is no deadlock (both for AsyncLoggers and AsyncLoggerConfig). To test AsyncLoggers , run the above demo.deadlock.DeadlockDemo with system properties -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector -DAsyncLogger.RingBufferSize=128 and configuration <?xml version= "1.0" encoding= "UTF-8" ?> <configuration status= "TRACE" name= "deadlock-demo" packages=""> <appenders> <Console name= "STDOUT" target= "SYSTEM_OUT" > <PatternLayout pattern= "%m%n" /> </Console> </appenders> <loggers> <root level= "DEBUG" > <appenderRef ref= "STDOUT" /> </root> </loggers> </configuration> To test AsyncLoggerConfig (mixed sync and async loggers), run the above demo.deadlock.DeadlockDemo with system properties -DAsyncLoggerConfig.RingBufferSize=128 and configuration <?xml version= "1.0" encoding= "UTF-8" ?> <configuration status= "TRACE" name= "deadlock-demo" packages=""> <appenders> <Console name= "STDOUT" target= "SYSTEM_OUT" > <PatternLayout pattern= "%m%n" /> </Console> </appenders> <loggers> <AsyncRoot level= "DEBUG" > <appenderRef ref= "STDOUT" /> </AsyncRoot> </loggers> </configuration> Fixed in revision 1555336. Please verify and close.
          Remko Popma made changes -
          Status In Progress [ 3 ] Resolved [ 5 ]
          Fix Version/s 2.0-rc1 [ 12325011 ]
          Resolution Fixed [ 1 ]
          Remko Popma made changes -
          Link This issue is related to LOG4J2-485 [ LOG4J2-485 ]
          Anthony Baldocchi made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open In Progress In Progress
          13d 14h 55m 1 Remko Popma 03/Jan/14 14:59
          In Progress In Progress Resolved Resolved
          18h 11m 1 Remko Popma 04/Jan/14 09:10
          Resolved Resolved Closed Closed
          2d 6h 57m 1 Anthony Baldocchi 06/Jan/14 16:07

            People

            • Assignee:
              Remko Popma
              Reporter:
              Anthony Baldocchi
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development