Oozie / OOZIE-2536

Hadoop's cleanup of local directory in uber mode causing failures


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.3.0
    • Component/s: None
    • Labels: None

    Description

      In our environment, we faced an issue where an uberized Shell action got stuck even though the shell command itself completed with exit status 0. Please refer to the attached syslog and stdout of the launcher job; the relevant excerpts are shown here.

      stdout:

      >>> Invoking Shell command line now >>
      Stdoutput myshellType=qmyshellUpdate
      Exit code of the Shell command 0
      <<< Invocation of Shell command completed <<<
      <<< Invocation of Main class completed <<<

      syslog:

      2016-05-23 11:15:52,587 WARN [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Unable to delete unexpected local file/dir .action.xml.crc: insufficient permissions?
      2016-05-23 11:15:52,588 FATAL [AsyncDispatcher event handler] org.apache.hadoop.conf.Configuration: error parsing conf propagation-conf.xml
      java.io.FileNotFoundException: /tmp/yarn-local/usercache/saley/appcache/application_1234_123/container_e01_1234_123_01_000001/propagation-conf.xml (No such file or directory)
      at java.io.FileInputStream.open0(Native Method)
      at java.io.FileInputStream.open(FileInputStream.java:195)
      at java.io.FileInputStream.<init>(FileInputStream.java:138)
      at java.io.FileInputStream.<init>(FileInputStream.java:93)
      at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
      at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
      at java.net.URL.openStream(URL.java:1038)
      at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2468)
      at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2539)
      at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
      at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
      at org.apache.hadoop.conf.Configuration.get(Configuration.java:981)
      at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1031)
      at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:1251)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.getMemoryRequired(TaskAttemptImpl.java:568)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.updateMillisCounters(TaskAttemptImpl.java:1295)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createJobCounterUpdateEventTASucceeded(TaskAttemptImpl.java:1323)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.access$3500(TaskAttemptImpl.java:147)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$SucceededTransition.transition(TaskAttemptImpl.java:1710)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$SucceededTransition.transition(TaskAttemptImpl.java:1701)
      at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
      at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
      at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
      at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1085)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:146)
      at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1394)
      at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1386)
      at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
      at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
      at java.lang.Thread.run(Thread.java:745)
      2016-05-23 11:15:52,590 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
      java.lang.RuntimeException: java.io.FileNotFoundException: /grid/5/tmp/yarn-local/usercache/saley/appcache/application_1234_123/container_e01_1234_123_01_000001/propagation-conf.xml (No such file or directory)
      at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2639)
      at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
      at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
      at org.apache.hadoop.conf.Configuration.get(Configuration.java:981)
      at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1031)
      at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:1251)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.getMemoryRequired(TaskAttemptImpl.java:568)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.updateMillisCounters(TaskAttemptImpl.java:1295)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createJobCounterUpdateEventTASucceeded(TaskAttemptImpl.java:1323)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.access$3500(TaskAttemptImpl.java:147)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$SucceededTransition.transition(TaskAttemptImpl.java:1710)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$SucceededTransition.transition(TaskAttemptImpl.java:1701)
      at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
      at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
      at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
      at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1085)
      at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:146)
      at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1394)
      at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1386)
      at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
      at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.FileNotFoundException: /tmp/yarn-local/usercache/saley/appcache/application_1234_123/container_e01_1234_123_01_000001/propagation-conf.xml (No such file or directory)
      at java.io.FileInputStream.open0(Native Method)
      at java.io.FileInputStream.open(FileInputStream.java:195)
      at java.io.FileInputStream.<init>(FileInputStream.java:138)
      at java.io.FileInputStream.<init>(FileInputStream.java:93)
      at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
      at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
      at java.net.URL.openStream(URL.java:1038)
      at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2468)
      at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2539)
      ... 22 more
      2016-05-23 11:15:52,591 INFO [AsyncDispatcher ShutDown handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
      2016-05-23 11:15:52,591 ERROR [AsyncDispatcher ShutDown handler] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[AsyncDispatcher ShutDown handler,5,main] threw an Exception.
      java.lang.SecurityException: Intercepted System.exit(-1)
      at org.apache.oozie.action.hadoop.LauncherSecurityManager.checkExit(LauncherMapper.java:637)
      at java.lang.Runtime.exit(Runtime.java:107)
      at java.lang.System.exit(System.java:971)
      at org.apache.hadoop.yarn.event.AsyncDispatcher$2.run(AsyncDispatcher.java:294)
      at java.lang.Thread.run(Thread.java:745)
      2016-05-23 11:16:44,589 WARN [LeaseRenewer:saley@namenode.com:8020] org.apache.hadoop.conf.Configuration: job.xml:an attempt to override final parameter: hadoop.tmp.dir; Ignoring.
      2016-05-23 11:20:53,677 INFO Socket Reader #2 for port 50500 SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for saley (auth:SIMPLE)
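
The excerpt above appears to show this sequence: the uber-mode cleanup removes files from the container's local directory, and a later lazy configuration lookup then tries to parse propagation-conf.xml, which is no longer there. The following is a minimal sketch of that failure mode using only the JDK. `LazyConfDemo` is a hypothetical stand-in and is not Hadoop's `Configuration` class; the Java properties XML format and the key `mapreduce.map.memory.mb` are used purely for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical stand-in for a lazily loaded configuration (not Hadoop's Configuration). */
public class LazyConfDemo {
    private final Path resource; // registered up front, parsed only on the first get()
    private Properties props;    // stays null until the first lookup

    LazyConfDemo(Path resource) {
        this.resource = resource;
    }

    String get(String key) {
        if (props == null) {
            Properties loaded = new Properties();
            try (InputStream in = new FileInputStream(resource.toFile())) {
                loaded.loadFromXML(in);
            } catch (IOException e) {
                // Same shape as the trace: a RuntimeException caused by an I/O error.
                throw new RuntimeException(e);
            }
            props = loaded;
        }
        return props.getProperty(key);
    }

    /** Returns the looked-up value, or the simple class name of the cause on failure. */
    static String demo(boolean deleteBeforeFirstRead) {
        Path dir = null;
        Path conf = null;
        try {
            dir = Files.createTempDirectory("container");
            conf = dir.resolve("propagation-conf.xml");
            Properties p = new Properties();
            p.setProperty("mapreduce.map.memory.mb", "1024");
            try (OutputStream out = Files.newOutputStream(conf)) {
                p.storeToXML(out, null);
            }
            LazyConfDemo lazy = new LazyConfDemo(conf); // nothing is parsed yet
            if (deleteBeforeFirstRead) {
                Files.delete(conf); // simulated cleanup of the local directory
            }
            try {
                return lazy.get("mapreduce.map.memory.mb");
            } catch (RuntimeException e) {
                return e.getCause().getClass().getSimpleName();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Best-effort removal of the temporary files created for the demo.
            try {
                if (conf != null) Files.deleteIfExists(conf);
                if (dir != null) Files.deleteIfExists(dir);
            } catch (IOException ignored) {
                // Nothing useful to do if cleanup of the demo files fails.
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("file present on first read: " + demo(false));
        System.out.println("file deleted before first read: " + demo(true));
    }
}
```

The sketch only illustrates why a deferred read fails once the backing file has been removed; it does not reproduce the Hadoop or Oozie code paths named in the stack trace.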

      Attachments

        1. OOZIE-2536-1.patch
          1 kB
          Satish Saley


          People

            Assignee: Satish Saley (satishsaley)
            Reporter: Satish Saley (satishsaley)
            Votes: 2
            Watchers: 6
