[SPARK-2646] log4j initialization not quite compatible with log4j 2.x - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 1.0.0, 1.0.1
Fix Version/s: 1.1.0
Component/s: Spark Core
Labels:
None

Description

The logging code that handles log4j initialization leads to an stack overflow error when used with log4j 2.x, which has just been released. This occurs even a downstream project has correctly adjusted SLF4J bindings, and that is the right thing to do for log4j 2.x, since it is effectively a separate project from 1.x.

Here is the relevant bit of Logging.scala:

  private def initializeLogging() {
    // If Log4j is being used, but is not initialized, load a default properties file
    val binder = StaticLoggerBinder.getSingleton
    val usingLog4j = binder.getLoggerFactoryClassStr.endsWith("Log4jLoggerFactory")
    val log4jInitialized = LogManager.getRootLogger.getAllAppenders.hasMoreElements
    if (!log4jInitialized && usingLog4j) {
      val defaultLogProps = "org/apache/spark/log4j-defaults.properties"
      Option(Utils.getSparkClassLoader.getResource(defaultLogProps)) match {
        case Some(url) =>
          PropertyConfigurator.configure(url)
          log.info(s"Using Spark's default log4j profile: $defaultLogProps")
        case None =>
          System.err.println(s"Spark was unable to load $defaultLogProps")
      }
    }
    Logging.initialized = true

    // Force a call into slf4j to initialize it. Avoids this happening from mutliple threads
    // and triggering this: http://mailman.qos.ch/pipermail/slf4j-dev/2010-April/002956.html
    log
  }

The first minor issue is that there is a call to a logger inside this method, which is initializing logging. In this situation, it ends up causing the initialization to be called recursively until the stack overflow. It would be slightly tidier to log this only after Logging.initialized = true. Or not at all. But it's not the root problem, or else, it would not work at all now.

The calls to log4j classes here always reference log4j 1.2 no matter what. For example, there is not getAllAppenders in log4j 2.x. That's fine. Really, "usingLog4j" means "using log4j 1.2" and "log4jInitialized" means "log4j 1.2 is initialized".

usingLog4j should be false for log4j 2.x, because the initialization only matters for log4j 1.2. But, it's true, and that's the real issue. And log4jInitialized is always false, since calls to the log4j 1.2 API are stubs and no-ops in this setup, where the caller has swapped in log4j 2.x. Hence the loop.

This is fixed, I believe, if "usingLog4j" can be false for log4j 2.x. The SLF4J static binding class has the same name for both versions, unfortunately, which causes the issue. However they're in different packages. For example, if the test included "... and begins with org.slf4j", it should work, as the SLF4J binding for log4j 2.x is provided by log4j 2.x at the moment, and is in package org.apache.logging.slf4j.

Of course, I assume that SLF4J will eventually offer its own binding. I hope to goodness they at least name the binding class differently, or else this will again not work. But then some other check can probably be made.

(Credit to Agust Egilsson for finding this; at his request I'm opening a JIRA for him. I'll propose a PR too.)

Attachments

Issue Links

relates to

SPARK-6305 Add support for log4j 2.x to Spark

Resolved

links to

[Github] Pull Request #1547 (srowen)

Activity

People

Assignee:: Unassigned

Reporter:: Sean R. Owen

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 23/Jul/14 13:05

Updated:: 12/Mar/15 13:11

Resolved:: 31/Jul/14 19:27