  1. Hadoop Common
  2. HADOOP-406

Tasks launched by tasktracker in separate JVM can't generate log output

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.4.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      Child JVMs don't have access to the logging configuration system properties. When a child JVM is launched, it doesn't inherit the Java system properties hadoop.log.dir and hadoop.log.file (which are derived from the Bash environment variables $HADOOP_LOG_DIR and $HADOOP_LOGFILE). As a result, you get no log messages from the map/reduce tasks that are actually executing.
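
Because system properties belong to a single JVM, a launcher has to hand them to the child explicitly, typically as -D arguments on the child's command line. The following is a minimal sketch of that idea, not the code in the attached patch: the class ChildLogArgs and the method forwardedArgs are hypothetical names, and only the two property names come from this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration; not part of Hadoop.
class ChildLogArgs {
    // Property names taken from the issue description.
    static final String[] LOG_PROPS = {"hadoop.log.dir", "hadoop.log.file"};

    // Builds one "-Dname=value" argument for each logging property
    // that is set in the given (parent) properties.
    static List<String> forwardedArgs(Properties props) {
        List<String> args = new ArrayList<>();
        for (String name : LOG_PROPS) {
            String value = props.getProperty(name);
            if (value != null) {
                args.add("-D" + name + "=" + value);
            }
        }
        return args;
    }

    public static void main(String[] unused) {
        // Example values only; a real launcher would pass System.getProperties().
        Properties parent = new Properties();
        parent.setProperty("hadoop.log.dir", "/var/log/hadoop");
        parent.setProperty("hadoop.log.file", "hadoop.log");
        System.out.println(forwardedArgs(parent));
    }
}
```

The resulting arguments would be placed before the main class name when the child's java command line is assembled.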

      Stefan Groschupf reported this problem a while back:

      -------------------------------------------------------------------------
      To: hadoop-dev@lucene.apache.org
      From: Stefan Groschupf <sg@media-style.com>
      Subject: tasks can't log bug?
      Date: Tue, 25 Jul 2006 19:26:17 -0700
      X-Virus-Checked: Checked by ClamAV on apache.org

      Hi Hadoop developers,

      I'm confused about the way logging works within map or reduce tasks.
      Since tasks are launched in a new JVM, the Java system properties "hadoop.log.dir" and "hadoop.log.file" are not passed to the new JVM.
      This prevents the child process from logging properly. In fact, you get:

      java.io.FileNotFoundException: / (Is a directory)
      at java.io.FileOutputStream.openAppend(Native Method)
      at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
      at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
      at org.apache.log4j.FileAppender.setFile(FileAppender.java:289)
      at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:165)
      at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:163)
      at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:256)
      at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:132)
      at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:96)
      at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:654)
      at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:612)
      at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.j
      2006-07-25 15:59:07,553 INFO mapred.TaskTracker (TaskTracker.java:main(993)) - Child
      at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:415)
      at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:441)
      at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:4
      at org.apache.log4j.LogManager.<clinit>(LogManager.java:122)
      at org.apache.log4j.Logger.getLogger(Logger.java:104)
      at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:229)
      at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:65)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImp
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAcc
      at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
      at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:529
      at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:235
      at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:370)
      at org.apache.hadoop.mapred.TaskTracker.<clinit>(TaskTracker.java:44)
      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:993)

      We see several ways to solve this problem. The first is to retrieve the properties "hadoop.log.dir" and "hadoop.log.file" from the parent JVM and pass them to the child JVM within the args parameter.
      The second is to read the environment variables "$HADOOP_LOG_DIR" and "$HADOOP_LOGFILE" using System.getenv() (Java 1.5).
      The third is a more general solution: TaskRunner would resolve any environment variables it finds in "mapred.child.java.opts" by looking up their values using System.getenv().
      E.g.:
      unix:
      export MAX_MEMORY=200m
      hadoop-site.xml:
      <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx${MAX_MEMORY}</value>
      </property>
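
The third option in the email above, resolving ${NAME} references in "mapred.child.java.opts" against the environment, could look roughly like the sketch below. It is an illustration under stated assumptions rather than Hadoop's implementation: the class EnvOptsExpander and the method expand are hypothetical, and leaving unknown variables untouched is a design choice made here, not something the email specifies.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Hadoop.
class EnvOptsExpander {
    // Matches ${NAME}, where NAME is a typical shell variable name.
    private static final Pattern VAR =
        Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    // Replaces each ${NAME} in opts with env.get(NAME).
    // References to variables missing from env are left as written.
    static String expand(String opts, Map<String, String> env) {
        Matcher m = VAR.matcher(opts);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = env.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Uses the real process environment, as the email proposes.
        System.out.println(expand("-Xmx${MAX_MEMORY}", System.getenv()));
    }
}
```

Taking the environment as a Map parameter, instead of calling System.getenv() inside expand, keeps the substitution logic testable without changing the process environment.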

      Attachments

        1. HADOOP-406.patch
          4 kB
          Chris Schneider

          People

            Assignee: Unassigned
            Reporter: Chris Schneider (schmed)
            Votes: 4
            Watchers: 2