Hadoop Map/Reduce / MAPREDUCE-4003

log.index (No such file or directory) AND Task process exit with nonzero status of 126

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0, 1.0.1
    • Fix Version/s: 1.0.3
    • Labels:
      None
    • Environment:
    • Tags:
      mapreduce tasktracker
    • Target Version/s:

      Description

      Hello, I have been stuck on this Hadoop (cdhu3) problem for two days and have tried every fix I could find on Google. The issue: when I run the Hadoop "wordcount" example, the TaskTracker log on one slave node shows the following errors:

      1. WARN org.apache.hadoop.mapred.DefaultTaskController: Task wrapper stderr: bash: /var/tmp/mapred/local/ttprivate/taskTracker/hdfs/jobcache/job_201203131751_0003/attempt_201203131751_0003_m_000006_0/taskjvm.sh: Permission denied

      2. WARN org.apache.hadoop.mapred.TaskRunner: attempt_201203131751_0003_m_000006_0 : Child Error java.io.IOException: Task process exit with nonzero status of 126.

      3. WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201203131751_0003_m_000003_0 java.io.FileNotFoundException: /usr/lib/hadoop-0.20/logs/userlogs/job_201203131751_0003/attempt_201203131751_0003_m_000003_0/log.index (No such file or directory)

      I could not find similar issues on Google, only a few posts that seem somewhat relevant. They suggest: A. the ulimit of the hadoop user, but my ulimit is set large enough for this bundled example; B. the memory used by the JVM, but my JVM only uses -Xmx200m, far too small to exceed the limits of my machine; C. the permissions on mapred.local.dir and the logs directory, but I set them with "chmod 777"; D. a full disk, but there is enough space for Hadoop in my log directory and mapred.local.dir.

      Thank you all. I am really at my wit's end; I have spent days on this and would appreciate any light you can shed.
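
For readers triaging the log entries above: the "Permission denied" message from bash and the exit status 126 are consistent with each other, because shells conventionally return 126 when a command is found but cannot be executed. The following is a minimal sketch of that convention; the class and method names are hypothetical and are not part of Hadoop.

```java
// Illustrative helper, not part of Hadoop: maps a child-process exit status
// to its conventional shell meaning.
final class ExitStatusHint {
    private ExitStatusHint() {}

    static String describe(int status) {
        if (status == 0) {
            return "success";
        }
        if (status == 126) {
            // e.g. "bash: .../taskjvm.sh: Permission denied"
            return "command found but could not be executed";
        }
        if (status == 127) {
            return "command not found";
        }
        if (status > 128) {
            // Shell convention: 128 + signal number.
            return "terminated by signal " + (status - 128);
        }
        return "command ran and reported failure";
    }

    public static void main(String[] args) {
        for (int status : new int[] {0, 1, 126, 127, 137}) {
            System.out.println(status + ": " + describe(status));
        }
    }
}
```

Under this convention, a status of 126 points at the execute permission of the script or at the mount options of its file system, while a status of 1 means the child process started and then failed on its own.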

      1. mapreduce-4003-1.patch
        2 kB
        Koji Noguchi
      2. MAPREDUCE-4003-branch1.patch
        2 kB
        Thomas Graves

        Activity

        Koji Noguchi added a comment -

        This also happens on our 0.20.205 clusters when a user's JVM crashes at startup due to an invalid JVM parameter or a JNI crash. The stderr/stdout files exist but log.index does not.

        WebUI shows

        HTTP ERROR 410
        
        Problem accessing /tasklog. Reason:
        
            Failed to retrieve stdout log for task: attempt_201202130706_50000_m_000004_0
        
        

        TaskTracker log shows

        2012-03-15 01:35:18,431 WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201202130706_50000_m_000004_0
        java.io.FileNotFoundException: /grid/log/mapred/userlogs/job_201202130706_50000/attempt_201202130706_50000_m_000004_0/log.index (No such file or directory)
                at java.io.FileInputStream.open(Native Method)
                at java.io.FileInputStream.<init>(FileInputStream.java:106)
                at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
                at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:187)
                at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:422)
                at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
                at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:269)
                at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
                at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
                at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
                at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
                at yjava.servlet.filter.BouncerFilter.doFilter(BouncerFilter.java:411)
                at com.yahoo.hadoop.HadoopBouncerFilter.doFilter(HadoopBouncerFilter.java:64)
                at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
                at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
                at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
                at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
                at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
                at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
                at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
                at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
                at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
                at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
                at org.mortbay.jetty.Server.handle(Server.java:326)
                at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
                at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
                at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
                at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
                at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
                at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
                at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
        2012-03-15 01:35:18,431 WARN org.mortbay.log: /tasklog: java.io.IOException: Closed
        

        Stdout/Stderr still exist with information that would help the users.

        -bash-3.2$ ls -l /grid/log/mapred/userlogs/job_201202130706_50000/attempt_201202130706_50000_m_000004_0/
        total 4
        -rw-r----- 1 knoguchi hadoop 137 Mar 14 22:43 stderr
        -rw-r----- 1 knoguchi hadoop   0 Mar 14 22:43 stdout
        -bash-3.2$ cat /grid/log/mapred/userlogs/job_201202130706_50000/attempt_201202130706_50000_m_000004_0/stderr
        Invalid maximum heap size: -Xmx10g
        The specified size exceeds the maximum representable size.
        Could not create the Java virtual machine.
        -bash-3.2$ 
        

        I thought MAPREDUCE-2366 fixed this, but I guess it didn't cover this case. Fixing this would save users' time as well as support (my) time.

        Koji Noguchi added a comment -

        This would probably only work with jvm reuse disabled. When log.index does not exist, this patch would simply try to open the files directly by providing a fake index info.
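
The fallback described in this comment can be sketched roughly as follows. This is not the code from the attached patch: the class name, method name, and the list of log file names are assumptions made for illustration. The idea shown is that when log.index is missing, each log file that exists is treated as one entry starting at offset 0 and spanning the whole file.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; names are hypothetical, not from the patch.
final class LogIndexFallback {
    // Assumed set of per-attempt log file names.
    private static final String[] LOG_NAMES = {"stdout", "stderr", "syslog"};

    private LogIndexFallback() {}

    /**
     * Builds "fake" index entries for an attempt directory without a
     * log.index: each existing log file maps to {start offset, length},
     * covering the whole file.
     */
    static Map<String, long[]> fakeIndex(File attemptDir) {
        Map<String, long[]> details = new LinkedHashMap<>();
        for (String name : LOG_NAMES) {
            File log = new File(attemptDir, name);
            if (log.isFile()) {
                details.put(name, new long[] {0L, log.length()});
            }
        }
        return details;
    }

    public static void main(String[] args) throws IOException {
        // Simulate an attempt directory that has stderr but no log.index.
        File dir = Files.createTempDirectory("attempt").toFile();
        Files.write(new File(dir, "stderr").toPath(),
                "Could not create the Java virtual machine.\n"
                        .getBytes(StandardCharsets.UTF_8));
        if (!new File(dir, "log.index").exists()) {
            for (Map.Entry<String, long[]> e : fakeIndex(dir).entrySet()) {
                System.out.println(e.getKey() + " start=" + e.getValue()[0]
                        + " length=" + e.getValue()[1]);
            }
        }
    }
}
```

A whole-file entry is presumably only accurate when a log file holds a single attempt's output, which would explain the caveat about JVM reuse.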

        Thomas Graves added a comment -

        +1, looks good to me. Yes, it isn't perfect if you have JVM reuse on; unfortunately there isn't a good way to see whether JVM reuse is enabled from the TaskLog code. But this is a big improvement.

        In the case mentioned above with invalid JVM args, the client now displays the message below, and you can see the same on the web UI for the job setup task.

        12/04/10 19:25:55 INFO mapred.JobClient: Task Id : attempt_201204101923_0001_m_000001_1, Status : FAILED
        java.lang.Throwable: Child Error
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
        Caused by: java.io.IOException: Task process exit with nonzero status of 1.
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)

        attempt_201204101923_0001_m_000001_1: Invalid maximum heap size: -Xmx10g
        attempt_201204101923_0001_m_000001_1: The specified size exceeds the maximum representable size.
        attempt_201204101923_0001_m_000001_1: Could not create the Java virtual machine.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12518411/mapreduce-4003-1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2189//console

        This message is automatically generated.

        Thomas Graves added a comment -

        Looks like the current patch has a FindBugs warning:

        Method org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskAttemptID, boolean) concatenates strings using + in a loop
        Bug type SBSC_USE_STRINGBUFFER_CONCATENATION
        In class org.apache.hadoop.mapred.TaskLog
        In method org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskAttemptID, boolean)
        At TaskLog.java:[line 200]

        Attaching a new patch with the FindBugs warning fixed by switching to StringBuffer.
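
For context, the pattern behind SBSC_USE_STRINGBUFFER_CONCATENATION and the usual fix look roughly like this. It is a generic sketch, not the code at TaskLog.java line 200; the class name, method names, and the one-name-per-line output format are made up for illustration.

```java
// Illustrative only: not the actual TaskLog code.
final class IndexTextBuilder {
    private IndexTextBuilder() {}

    // The flagged pattern: each += allocates a new String and copies
    // everything accumulated so far, so the loop is quadratic in output size.
    static String withConcatenation(String[] names) {
        String out = "";
        for (String name : names) {
            out += name + "\n";
        }
        return out;
    }

    // The fix: append into one growable buffer and convert once at the end.
    static String withBuffer(String[] names) {
        StringBuffer buffer = new StringBuffer();
        for (String name : names) {
            buffer.append(name).append('\n');
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String[] names = {"stdout", "stderr", "syslog"};
        System.out.print(withBuffer(names));
        System.out.println(withConcatenation(names).equals(withBuffer(names)));
    }
}
```

Both methods return the same text; only the allocation behavior differs. StringBuilder is the unsynchronized equivalent and is usually preferred when the buffer is confined to one thread.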

        Thomas Graves added a comment -

        Fixed the FindBugs warning.

        Thomas Graves added a comment -

        Thanks Koji!

        I've committed this to 1.1.0.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522166/MAPREDUCE-4003-branch1.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2192//console

        This message is automatically generated.

        toughman added a comment -

        Hello, but when I encounter this problem, there is no JVM memory error, and there are no stdout/stderr files in the folder...

        Koji Noguchi added a comment -

        but when I encounter this problem, there is no JVM memory error, and there are no stdout/stderr files in the folder...

        Sorry toughman, I guess I jumped on the "3. log.index (No such file or directory)" error that I was facing and didn't solve your original problem. If you're still having problems, let's discuss it on the mailing list (mapreduce-user@hadoop.apache.org), or please try the CDH mailing list.

        Matt Foley added a comment -

        Merged to branch-1.0 for 1.0.3 at TGraves' request.

        Matt Foley added a comment -

        Closed upon release of Hadoop-1.0.3.


          People

          • Assignee:
            Koji Noguchi
            Reporter:
            toughman
          • Votes:
            0
            Watchers:
            9
