Hadoop Common
  HADOOP-1179

TaskTracker should be restarted if its Jetty HTTP server cannot serve get-map-output files

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.3
    • Component/s: None
    • Labels: None

      Description

      Due to some error (a memory leak?), the Jetty HTTP server throws an OutOfMemoryError when serving get-map-output requests:

      2007-03-28 20:42:39,608 WARN org.mortbay.jetty.servlet.ServletHandler: Error for
      /mapOutput?map=task_0334_m_013127_0&reduce=591
      2007-03-28 20:46:42,788 WARN org.mortbay.jetty.servlet.ServletHandler: Error for
      /mapOutput?map=task_0334_m_013127_0&reduce=591
      2007-03-28 20:49:38,064 WARN org.mortbay.jetty.servlet.ServletHandler: Error for
      java.lang.OutOfMemoryError
          at java.io.FileInputStream.readBytes(Native Method)
          at java.io.FileInputStream.read(FileInputStream.java:199)
          at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.read(RawLocalFileSystem.java:119)
          at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
          at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
          at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
          at java.io.DataInputStream.read(DataInputStream.java:132)
          at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:182)
          at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:167)
          at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
          at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
          at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
          at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
          at java.io.DataInputStream.readFully(DataInputStream.java:178)
          at java.io.DataInputStream.readLong(DataInputStream.java:399)
          at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:1643)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
          at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
          at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
          at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
          at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
          at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
          at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
          at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
          at org.mortbay.http.HttpServer.service(HttpServer.java:954)
          at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
          at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
          at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
          at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
          at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
          at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

      In this case, the task tracker cannot send out the map output files on that machine, rendering it useless.
      Moreover, all the reduces depending on those map output files are just stuck there.
      If the task tracker reports a failure to the job tracker, the map/reduce job can recover.
      If the task tracker is restarted, it can rejoin the cluster as a new member.

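      One way to realize the behaviour proposed above, sketched here only as an illustration (this is not the attached patch; MapOutputServletSketch and serveMapOutput are hypothetical names), is to treat an OutOfMemoryError in the map-output servlet as fatal and exit the TaskTracker process so that an external watchdog or operator can restart it and it rejoins the cluster:

      // Illustrative sketch only, not the committed fix; helper names are hypothetical.
      import java.io.IOException;
      import javax.servlet.ServletException;
      import javax.servlet.http.HttpServlet;
      import javax.servlet.http.HttpServletRequest;
      import javax.servlet.http.HttpServletResponse;

      public class MapOutputServletSketch extends HttpServlet {
        public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
          try {
            serveMapOutput(request, response);   // hypothetical helper that streams the map output
          } catch (OutOfMemoryError e) {
            // The tracker can no longer serve map outputs reliably: log the failure and
            // exit so that a wrapper script can restart the daemon as a fresh member.
            log("Out of memory while serving map output; shutting down TaskTracker", e);
            System.exit(-1);
          }
        }

        private void serveMapOutput(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
          // ... look up the requested map output and copy it to the response stream ...
        }
      }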
      Attachments

      1. 1179.patch
         3 kB
         Devaraj Das

        Activity

        Hadoop QA added a comment -

        Integrated in Hadoop-Nightly #48 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/48/ )

        Tom White added a comment -

        I've just committed this. Thanks Devaraj!

        Owen O'Malley added a comment -

        +1

        Koji Noguchi added a comment -

        Each of the nodes that logged org.mortbay.jetty.servlet.ServletHandler: java.lang.OutOfMemoryError also had one instance of an OutOfMemoryError from the in-memory merge:

        2007-03-28 16:30:58,204 WARN org.apache.hadoop.mapred.TaskRunner: task_0334_r_000606_0 Intermediate Merge of the inmemory files threw an exception: java.lang.OutOfMemoryError
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:166)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.write(ChecksumFileSystem.java:391)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.SequenceFile$CompressedBytes.writeCompressedBytes(SequenceFile.java:492)
        at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.appendRaw(SequenceFile.java:903)
        at org.apache.hadoop.io.SequenceFile$Sorter.writeFile(SequenceFile.java:2227)
        at org.apache.hadoop.mapred.ReduceTaskRunner$InMemFSMergeThread.run(ReduceTaskRunner.java:838)

        Doug Cutting added a comment -

        Note that running out of file handles can cause an OutOfMemoryError even when there's lots of memory left, so this could just be a file-handle leak. It'd be great to log the heap size somehow (e.g. simply by turning on verbose GC, which has little performance impact) so we have an idea of whether this is really a memory issue or a file-handle issue.

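        If turning on verbose GC is inconvenient, a rough picture of heap usage can also be logged from inside the daemon with the standard java.lang.Runtime API. A minimal sketch, purely illustrative (the log call stands in for whatever logger is at hand):

        // Illustrative only: log approximate heap usage so a true memory shortage can be
        // told apart from file-handle exhaustion surfacing as OutOfMemoryError.
        Runtime rt = Runtime.getRuntime();
        long usedMB  = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        long totalMB = rt.totalMemory() / (1024 * 1024);
        long maxMB   = rt.maxMemory() / (1024 * 1024);
        log.info("heap used=" + usedMB + "MB total=" + totalMB + "MB max=" + maxMB + "MB");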
        Runping Qi added a comment -

        This issue is related to HADOOP-1158 but is not a duplicate.
        Two things need to be investigated/fixed. The first is why Jetty (or the TaskTracker) runs out of memory; the earlier comments and the attached patch may address that partially, but I am not sure they account for everything.

        The second is proper handling of OutOfMemoryError. When this happens, it may be desirable to kill the TaskTracker and hope it gets restarted.

        Devaraj Das added a comment -

        I have attached a patch that closes the index file as soon as the read is done. Regarding the CRC buffers, I think what needs to change is the method ChecksumFileSystem.getSumBufferSize, but I was not sure whether that would affect other parts of the code, so I didn't touch that part.

        Devaraj Das added a comment -

        Yeah Owen, I totally missed the open-file issues.

        Owen O'Malley added a comment -

        Actually, it isn't really a duplicate of HADOOP-1158. Part of what is going on is that the MapOutputServlet is using a lot of memory. In particular, each thread is using:
        1. A file to read the index file
        2. A file to read the index file crcs
        3. A file to read the map outputs
        4. A file to read the map output crcs
        All 4 files use the io.file.buffer.size, which we usually set to 64k. Since the servlet is lazy about closing them, all 4 files stay open during the entire call. Since we have ~40 threads, that is 10MB of file buffers in the TaskTracker's JVM.

        A couple of steps would make this much better:
        1. Use smaller buffers for reading crc files (io.file.buffer.size * sizeof(crc) / sizeof(crc block) ?)
        2. Close the index file before opening the data files

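        A rough sketch of the two suggestions above, for illustration only (openIndexFile, openDataFile and copyToResponse are hypothetical stand-ins for the code in TaskTracker$MapOutputServlet, and the index-record layout is assumed):

        // Back-of-the-envelope: 4 buffered streams per thread * 64 KB buffers * ~40 threads ≈ 10 MB.
        // The pattern below releases the index file's buffers before the long-running copy starts.
        long startOffset;
        long partLength;
        FSDataInputStream indexIn = openIndexFile(mapId);      // index file plus its crc stream
        try {
          indexIn.seek(reduceId * 16);                         // assumed layout: two longs per reduce
          startOffset = indexIn.readLong();
          partLength  = indexIn.readLong();
        } finally {
          indexIn.close();                                     // close before opening the data files
        }
        FSDataInputStream dataIn = openDataFile(mapId);        // map output plus its crc stream
        try {
          dataIn.seek(startOffset);
          copyToResponse(dataIn, partLength, response);        // stream partLength bytes to the reducer
        } finally {
          dataIn.close();
        }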
        Devaraj Das added a comment -

        This is a duplicate of HADOOP-1158

        Devaraj Das added a comment -

        This is related to HADOOP-1158


          People

          • Assignee: Devaraj Das
          • Reporter: Runping Qi
          • Votes: 0
          • Watchers: 0

            Dates

            • Created:
            • Updated:
            • Resolved:
