Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-2646

log4j initialization not quite compatible with log4j 2.x



    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.0.1
    • Fix Version/s: 1.1.0
    • Component/s: Spark Core
    • Labels:


      The logging code that handles log4j initialization leads to an stack overflow error when used with log4j 2.x, which has just been released. This occurs even a downstream project has correctly adjusted SLF4J bindings, and that is the right thing to do for log4j 2.x, since it is effectively a separate project from 1.x.

      Here is the relevant bit of Logging.scala:

        private def initializeLogging() {
          // If Log4j is being used, but is not initialized, load a default properties file
          val binder = StaticLoggerBinder.getSingleton
          val usingLog4j = binder.getLoggerFactoryClassStr.endsWith("Log4jLoggerFactory")
          val log4jInitialized = LogManager.getRootLogger.getAllAppenders.hasMoreElements
          if (!log4jInitialized && usingLog4j) {
            val defaultLogProps = "org/apache/spark/log4j-defaults.properties"
            Option(Utils.getSparkClassLoader.getResource(defaultLogProps)) match {
              case Some(url) =>
                log.info(s"Using Spark's default log4j profile: $defaultLogProps")
              case None =>
                System.err.println(s"Spark was unable to load $defaultLogProps")
          Logging.initialized = true
          // Force a call into slf4j to initialize it. Avoids this happening from mutliple threads
          // and triggering this: http://mailman.qos.ch/pipermail/slf4j-dev/2010-April/002956.html

      The first minor issue is that there is a call to a logger inside this method, which is initializing logging. In this situation, it ends up causing the initialization to be called recursively until the stack overflow. It would be slightly tidier to log this only after Logging.initialized = true. Or not at all. But it's not the root problem, or else, it would not work at all now.

      The calls to log4j classes here always reference log4j 1.2 no matter what. For example, there is not getAllAppenders in log4j 2.x. That's fine. Really, "usingLog4j" means "using log4j 1.2" and "log4jInitialized" means "log4j 1.2 is initialized".

      usingLog4j should be false for log4j 2.x, because the initialization only matters for log4j 1.2. But, it's true, and that's the real issue. And log4jInitialized is always false, since calls to the log4j 1.2 API are stubs and no-ops in this setup, where the caller has swapped in log4j 2.x. Hence the loop.

      This is fixed, I believe, if "usingLog4j" can be false for log4j 2.x. The SLF4J static binding class has the same name for both versions, unfortunately, which causes the issue. However they're in different packages. For example, if the test included "... and begins with org.slf4j", it should work, as the SLF4J binding for log4j 2.x is provided by log4j 2.x at the moment, and is in package org.apache.logging.slf4j.

      Of course, I assume that SLF4J will eventually offer its own binding. I hope to goodness they at least name the binding class differently, or else this will again not work. But then some other check can probably be made.

      (Credit to Agust Egilsson for finding this; at his request I'm opening a JIRA for him. I'll propose a PR too.)


          Issue Links



              • Assignee:
                srowen Sean R. Owen
              • Votes:
                0 Vote for this issue
                2 Start watching this issue


                • Created: