[HADOOP-16] RPC call times out while indexing map task is computing splits - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.1.0
Fix Version/s: 0.1.0
Component/s: None
Labels: None
Environment:

MapReduce multi-computer crawl environment: 11 machines (1 master with JobTracker/NameNode, 10 slaves with TaskTrackers/DataNodes)

Description

We've been using Nutch 0.8 (MapReduce) to perform some internet crawling. Things seemed to be going well until...

060129 222409 Lost tracker 'tracker_56288'
060129 222409 Task 'task_m_10gs5f' has been lost.
060129 222409 Task 'task_m_10qhzr' has been lost.
........
........
060129 222409 Task 'task_r_zggbwu' has been lost.
060129 222409 Task 'task_r_zh8dao' has been lost.
060129 222455 Server handler 8 on 8010 caught: java.net.SocketException: Socket closed
java.net.SocketException: Socket closed
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:99)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
at java.io.DataOutputStream.flush(DataOutputStream.java:106)
at org.apache.nutch.ipc.Server$Handler.run(Server.java:216)
060129 222455 Adding task 'task_m_cia5po' to set for tracker 'tracker_56288'
060129 223711 Adding task 'task_m_ffv59i' to set for tracker 'tracker_25647'

I'm hoping that someone could explain why task_m_cia5po got added to tracker_56288 after this tracker was lost.

The Crawl.main process died with the following output:

060129 221129 Indexer: adding segment: /user/crawler/crawl-20060129091444/segments/20060129200246
Exception in thread "main" java.io.IOException: timed out waiting for response
at org.apache.nutch.ipc.Client.call(Client.java:296)
at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
at $Proxy1.submitJob(Unknown Source)
at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
at org.apache.nutch.indexer.Indexer.index(Indexer.java:263)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)

However, it definitely seems as if the JobTracker is still waiting for the job to finish (no failed jobs).

Doug Cutting's response:
The bug here is that the RPC call times out while the map task is computing splits. The fix is that the job tracker should not compute splits until after it has returned from the submitJob RPC. Please submit a bug in Jira to help remind us to fix this.
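
To illustrate the fix described above, here is a minimal Java sketch of a submit call that returns before split computation starts. It is not the actual JobTracker code or the attached patch. The class DeferredSplitTracker, its State enum, and the computeSplits callback are hypothetical names made up for this example. The only assumption is that split computation can be handed to a background thread once the job has been recorded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from the Nutch/Hadoop source.
class DeferredSplitTracker {
    enum State { PREP, READY, FAILED }

    private final Map<String, State> states = new ConcurrentHashMap<>();
    private final Map<String, CountDownLatch> initDone = new ConcurrentHashMap<>();
    // Single background thread that performs the slow job initialization.
    private final ExecutorService initializer = Executors.newSingleThreadExecutor();
    private int nextId = 0;

    // RPC-facing method: it only records the job and queues its initialization,
    // so the caller gets a job id back without waiting for split computation.
    synchronized String submitJob(Runnable computeSplits) {
        String id = "job_" + (nextId++);
        CountDownLatch latch = new CountDownLatch(1);
        states.put(id, State.PREP);
        initDone.put(id, latch);
        initializer.execute(() -> {
            try {
                computeSplits.run();              // potentially slow work
                states.put(id, State.READY);
            } catch (RuntimeException e) {
                states.put(id, State.FAILED);
            } finally {
                latch.countDown();                // signal that initialization ended
            }
        });
        return id;
    }

    State getState(String id) {
        return states.get(id);
    }

    // Waits up to timeoutMs for initialization to end; true only if the job is READY.
    boolean awaitReady(String id, long timeoutMs) {
        CountDownLatch latch = initDone.get(id);
        if (latch == null) {
            return false;
        }
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS)
                    && states.get(id) == State.READY;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    void shutdown() {
        initializer.shutdown();
    }
}
```

With this shape, a client calling submitJob sees the job in the PREP state right away and can poll getState (or call awaitReady) afterwards, so a long split computation no longer counts against the RPC timeout.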

Attachments

patch_h16.v0 (28/Feb/06 08:46, 20 kB, Mike Cafarella)
patch.16 (14/Feb/06 18:53, 2 kB, Mike Cafarella)

Issue Links

relates to: HADOOP-12 InputFormat used in job must be in JobTracker classpath (not loaded from job JAR) (Closed)

People

Assignee: Mike Cafarella

Reporter: Chris Schneider

Votes: 2

Watchers: 1

Dates

Created: 01/Feb/06 08:25

Updated: 08/Jul/09 16:51

Resolved: 03/Mar/06 08:09