Hadoop Common / HADOOP-8274

In pseudo-distributed or cluster mode under Cygwin, the TaskTracker cannot create a new job because of a symlink problem.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.20.205.0, 1.0.0, 1.0.1, 0.22.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: Windows 7 + Cygwin 1.7.11-1 + JDK 1.6.0_31 + Hadoop 1.0.0

    Description

      The standalone mode is OK. But in pseudo-distributed or cluster mode, it always throws errors, even when I just run the wordcount example.

      HDFS works fine, but the TaskTracker cannot create child JVMs for a new job. The directory under /logs/userlogs/job-xxxx/attempt-xxxx/ stays empty.

      The reason appears to be that, on Windows, Java does not recognize a symlink to a folder as a folder.

      The detailed description follows.

      ======================================================================================================

      First, the error log of the TaskTracker looks like this:

      ======================
      12/03/28 14:35:13 INFO mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201203280212_0005_m_-1386636958
      12/03/28 14:35:13 INFO mapred.JvmManager: JVM Runner jvm_201203280212_0005_m_-1386636958 spawned.
      12/03/28 14:35:17 INFO mapred.JvmManager: JVM Not killed jvm_201203280212_0005_m_-1386636958 but just removed
      12/03/28 14:35:17 INFO mapred.JvmManager: JVM : jvm_201203280212_0005_m_-1386636958 exited with exit code -1. Number of tasks it ran: 0
      12/03/28 14:35:17 WARN mapred.TaskRunner: attempt_201203280212_0005_m_000002_0 : Child Error
      java.io.IOException: Task process exit with nonzero status of -1.
      at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
      12/03/28 14:35:21 INFO mapred.TaskTracker: addFreeSlot : current free slots : 2
      12/03/28 14:35:24 INFO mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201203280212_0005_m_000002_1 task's state:UNASSIGNED
      12/03/28 14:35:24 INFO mapred.TaskTracker: Trying to launch : attempt_201203280212_0005_m_000002_1 which needs 1 slots
      12/03/28 14:35:24 INFO mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201203280212_0005_m_000002_1 which needs 1 slots
      12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201203280212_0005_m_000002_0
      java.io.FileNotFoundException: D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index (The system cannot find the path specified)
      at java.io.FileInputStream.open(Native Method)
      at java.io.FileInputStream.<init>(FileInputStream.java:120)
      at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
      at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
      at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
      at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
      at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
      at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
      at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
      at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
      at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
      at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
      at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
      at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
      at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
      at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
      at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
      at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
      at org.mortbay.jetty.Server.handle(Server.java:326)
      at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
      at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
      at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
      at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
      at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
      at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
      at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
      12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stderr log for task: attempt_201203280212_0005_m_000002_0
      java.io.FileNotFoundException: D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_0\log.index (The system cannot find the path specified)
      at java.io.FileInputStream.open(Native Method)
      at java.io.FileInputStream.<init>(FileInputStream.java:120)
      at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
      at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188)
      at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423)
      at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
      at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
      at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
      at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
      at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
      at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
      at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
      at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
      at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
      at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
      at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
      at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
      at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
      at org.mortbay.jetty.Server.handle(Server.java:326)
      at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
      at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
      at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
      at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
      at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
      at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
      at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

      =======================================

      I tried to remote-debug the TaskTracker. In

      org.apache.hadoop.mapred.TaskLog.createTaskAttemptLogDir(TaskAttemptID, boolean, String[]), line 97:
      public static void createTaskAttemptLogDir(TaskAttemptID taskID,
          boolean isCleanup, String[] localDirs) throws IOException {
        String cleanupSuffix = isCleanup ? ".cleanup" : "";
        String strAttemptLogDir = getTaskAttemptLogDir(taskID,
            cleanupSuffix, localDirs);
        File attemptLogDir = new File(strAttemptLogDir);
        if (!attemptLogDir.mkdirs()) {
          throw new IOException("Creation of " + attemptLogDir + " failed.");
        }
        String strLinkAttemptLogDir =
            getJobDir(taskID.getJobID()).getAbsolutePath() + File.separatorChar +
            taskID.toString() + cleanupSuffix;
        if (FileUtil.symLink(strAttemptLogDir, strLinkAttemptLogDir) != 0) {
          throw new IOException("Creation of symlink from " +
              strLinkAttemptLogDir + " to " + strAttemptLogDir + " failed.");
        }
        // Set permissions for target attempt log dir
        FsPermission userOnly = new FsPermission((short) 0777); // was: new FsPermission((short) 0700)
        FileUtil.setPermission(attemptLogDir, userOnly);
      }
      and the FileUtil.symLink() function:
      public static int symLink(String target, String linkname) throws IOException {
        String cmd = "ln -s " + target + " " + linkname;
        Process p = Runtime.getRuntime().exec(cmd, null);
        int returnVal = -1;
        try {
          returnVal = p.waitFor();
        } catch (InterruptedException e) {
          // do nothing as of yet
        }
        if (returnVal != 0) {
          LOG.warn("Command '" + cmd + "' failed " + returnVal +
              " with: " + copyStderr(p));
        }
        return returnVal;
      }
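
      To see this symlink behavior in isolation, a minimal standalone probe along the following lines can reproduce it outside the TaskTracker (this is illustration code, not part of Hadoop; it assumes Cygwin's ln.exe is on the PATH and uses made-up paths under D:\tmp):

      import java.io.File;
      import java.io.IOException;

      // Standalone probe: create a real directory, "symlink" it with Cygwin's
      // ln -s (the same command FileUtil.symLink runs), and see how
      // java.io.File classifies the link on Windows.
      public class SymlinkProbe {
        public static void main(String[] args) throws IOException, InterruptedException {
          File target = new File("D:/tmp/symlink-probe/real-dir");    // hypothetical path
          File link = new File("D:/tmp/symlink-probe/link-to-dir");   // hypothetical path
          if (!target.mkdirs() && !target.isDirectory()) {
            throw new IOException("could not create " + target);
          }
          Process p = Runtime.getRuntime().exec(
              new String[] { "ln", "-s", target.getPath(), link.getPath() });
          System.out.println("ln -s exit code: " + p.waitFor());
          // With Cygwin's default (emulated) symlinks, the JVM reports
          // isDirectory() = false and isFile() = true for the link, which is
          // exactly why the TaskTracker's mkdirs()/isDirectory() checks fail.
          System.out.println("isDirectory() = " + link.isDirectory());
          System.out.println("isFile()      = " + link.isFile());
        }
      }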

      From this we can see that Hadoop creates the real attempt log folder under ${hadoop.tmp.dir}, and then invokes "ln -s" to create a symlink to it under /logs/userlogs/job-xxxx/attempt-xxxx.

      In my case:
      strLinkAttemptLogDir = D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1
      strAttemptLogDir = /tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1

      When the TaskTracker then tries to launch the child task, it fails in the following function:

      In org.apache.hadoop.mapred.DefaultTaskController.launchTask(String, String, String, List<String>, List<String>, File, String, String), line 107:
      ...............
      // mkdir the loglocation
      String logLocation = TaskLog.getAttemptDir(jobId, attemptId).toString();
      if (!localFs.mkdirs(new Path(logLocation))) {
        throw new IOException("Mkdirs failed to create " + logLocation);
      }

      ..............

      mkdirs() returns false, because logLocation is a symlink file. In my case, logLocation = D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1. If I open it from Explorer in Windows, it is just a file, not a folder or a shortcut, and its content is like:
      <symlink>/tmp/hadoop-timwu/mapred/local\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000002_1

      This is because mkdirs() is implemented as:
      public boolean mkdirs(Path f) throws IOException {
        Path parent = f.getParent();
        File p2f = pathToFile(f);
        return (parent == null || mkdirs(parent)) &&
            (p2f.mkdir() || p2f.isDirectory());
      }

      Here p2f.mkdir() fails (a plain file already exists at that path), p2f.isDirectory() returns false, and p2f.isFile() returns true; as far as Java is concerned, it is a file. Hence IOException("Mkdirs failed to create D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_000001_1")
      is thrown in the child thread, which returns -1. Then we get the exception shown above in the main thread.

      Is there any way to disable this symlink creation, or is there another approach I can follow?
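
      (One direction that might work, sketched below, is to skip the symlink on Windows and create a real directory at the link location instead. This is only a hypothetical variant of FileUtil.symLink, untested; the os.name check and the fallback behaviour are my own assumptions, not anything Hadoop provides.)

      // Hypothetical, untested sketch of a Windows-aware FileUtil.symLink variant:
      // on Windows/Cygwin, where "ln -s" produces an emulated symlink that the
      // JVM treats as a plain file, fall back to creating a real directory at
      // the link location so the later localFs.mkdirs()/isDirectory() checks pass.
      public static int symLinkOrMkdir(String target, String linkname) throws IOException {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        if (!windows) {
          Process p = Runtime.getRuntime().exec("ln -s " + target + " " + linkname, null);
          try {
            return p.waitFor();
          } catch (InterruptedException e) {
            return -1;
          }
        }
        // Fallback: no symlink at all; task logs would then live directly under
        // logs/userlogs/... instead of being linked back to the mapred local dirs.
        File linkDir = new File(linkname);
        return (linkDir.mkdirs() || linkDir.isDirectory()) ? 0 : -1;
      }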

      BTW, in core-site.xml I set hadoop.tmp.dir = /tmp/hadoop-${user.name}, and my ${user.name} is timwu, so it should create the tmp folder /tmp/hadoop-timwu under Cygwin's root. However, it actually creates the folder d:/tmp/hadoop-timwu, which in Cygwin is /cygdrive/d/tmp/hadoop-timwu. Is that correct?
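
      That behaviour looks consistent with how java.io.File resolves drive-relative paths on Windows: a path that starts with "/" but has no drive letter is resolved against the drive of the JVM's current working directory, not against Cygwin's root. A tiny probe (illustration only) shows it:

      import java.io.File;

      public class DriveRelativePathProbe {
        public static void main(String[] args) {
          // "/tmp/hadoop-timwu" has no drive letter, so on Windows it resolves
          // against the drive of the current working directory, e.g. D:.
          File f = new File("/tmp/hadoop-timwu");
          // Prints something like D:\tmp\hadoop-timwu when run from a D:\ directory.
          System.out.println(f.getAbsolutePath());
        }
      }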

          People

            Assignee: haoxiaohui
            Reporter: tim.wu
            Votes: 2
            Watchers: 6
