[HADOOP-2492] ConcurrentModificationException in org.apache.hadoop.ipc.Server.Responder - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.16.0
Fix Version/s: 0.16.0
Component/s: ipc
Labels: None

Description

I was running Hadoop on 800 machines. After a couple of jobs had run, and once 100% of the maps of the current job had run, the JobTracker stopped responding and all TaskTrackers were lost. When I looked at the JobTracker (JT) logs, these entries seemed alarming:
2007-12-26 19:18:30,185 WARN org.apache.hadoop.ipc.Server: Exception in Responder java.util.ConcurrentModificationException
Following the above exception, I saw a large number of warnings like:
2007-12-26 19:23:10,926 WARN org.apache.hadoop.ipc.Server: Call queue overflow discarding oldest call heartbeat(org.apache.hadoop.mapred.TaskTrackerStatus@5a05f9, false, true, 1758) from 1.2.3.4:1234

Judging by the number of call queue overflow warnings, it seemed that the JobTracker stopped processing RPCs after it hit the ConcurrentModificationException; around that time the TaskTrackers started getting timeouts on RPCs.
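To illustrate why a stalled consumer would produce a stream of "discarding oldest call" warnings, here is a minimal sketch of a bounded queue that drops its oldest entry when full. It is not Hadoop's actual call queue; the class name `CallQueueSketch`, its methods, and the capacity of 3 are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not the org.apache.hadoop.ipc.Server code.
class CallQueueSketch {
    private final Deque<String> calls = new ArrayDeque<>();
    private final int capacity;
    private int discarded = 0;

    CallQueueSketch(int capacity) {
        this.capacity = capacity;
    }

    // Adds a call; if the queue is already full, the oldest call is dropped first.
    synchronized void enqueue(String call) {
        if (calls.size() >= capacity) {
            calls.removeFirst();
            discarded++;
        }
        calls.addLast(call);
    }

    synchronized int size() { return calls.size(); }
    synchronized int discardedCount() { return discarded; }
    synchronized String oldest() { return calls.peekFirst(); }

    public static void main(String[] args) {
        // No consumer ever drains the queue, so every enqueue past capacity discards one call.
        CallQueueSketch queue = new CallQueueSketch(3);
        for (int i = 0; i < 5; i++) {
            queue.enqueue("heartbeat-" + i);
        }
        System.out.println(queue.size() + " queued, " + queue.discardedCount()
                + " discarded, oldest=" + queue.oldest());
    }
}
```

With nothing draining the queue, each incoming heartbeat beyond the capacity evicts an older one, which matches the pattern of repeated overflow warnings seen in the log.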

The ConcurrentModificationException occurred twice, but the first occurrence did not seem to have any effect on the call queue.
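For context on the exception itself, the sketch below shows one common way a ConcurrentModificationException arises in Java: a collection is structurally modified while an iterator over it is in use. This report does not say which collection the Responder was iterating or what the fix changed, so this is a generic, single-threaded illustration with hypothetical names (`CmeSketch`, `pending`), not a reproduction of the bug.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not the org.apache.hadoop.ipc.Server code.
class CmeSketch {
    // Removes an element through the list while a for-each loop is iterating it.
    // The iterator notices the structural change on its next call to next().
    static boolean removeDuringIterationThrows() {
        List<Integer> pending = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer call : pending) {
                if (call == 1) {
                    pending.remove(call); // remove(Object), bypassing the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes the same element through the iterator, which keeps its state consistent.
    static List<Integer> removeViaIterator() {
        List<Integer> pending = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        for (Iterator<Integer> it = pending.iterator(); it.hasNext();) {
            if (it.next() == 1) {
                it.remove();
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        System.out.println("unsafe removal threw: " + removeDuringIterationThrows());
        System.out.println("safe removal result: " + removeViaIterator());
    }
}
```

The same exception can also appear when a second thread modifies an unsynchronized collection during iteration, although in that case it is only raised on a best-effort basis.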

Attachments

rpcexception.patch, 29/Dec/07 06:25, 1 kB, Dhruba Borthakur

People

Assignee: Dhruba Borthakur

Reporter: Devaraj Das

Dates

Created: 27/Dec/07 07:33

Updated: 08/Feb/08 23:38

Resolved: 01/Jan/08 18:37