[HADOOP-1651] Some improvements in progress reporting - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.15.0
Component/s: None
Labels: None

Description

Some improvements that can be made:
1) The progress reporting interval can be increased. It is currently 1 second; I propose making it 3 seconds to reduce the load on the TaskTracker.
2) Progress reports can be missed. In the reporting loop, if the first attempt to report progress fails, it is not retried; the next communication will be a 'ping'.
3) If an exception occurs while reporting progress or pinging, the client should sleep for some time before retrying.
4) The TaskUmbilicalProtocol client can stay connected to the server at all times. Currently, the default idle timeout on the IPC client is 1000 msec, meaning the client disconnects once the connection has been idle for 1000 msec. This can lead to unnecessary tearing down and setting up of connections for the TaskUmbilicalProtocol, which can be avoided by using a high idle timeout for this protocol. The idle timeout exists so that clients do not hold on to server connections unnecessarily, which makes the servers more scalable when there are 1000s of clients; this applies mainly to protocols involving the JT and the NameNode. The TaskUmbilicalProtocol does not run into such scalability issues, since it is limited to a few Tasks and their TaskTracker.
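Items 1 to 3 could be sketched roughly as follows. This is only an illustration of the proposed behavior, not the code in the attached patches: the `Umbilical` interface is a hypothetical stand-in for the real TaskUmbilicalProtocol, and the class, method, and constant names are assumptions made for the example.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

class ProgressReporterSketch {

    /** Hypothetical stand-in for the task-to-TaskTracker RPC interface. */
    public interface Umbilical {
        void statusUpdate(float progress) throws IOException;
        void ping() throws IOException;
    }

    /** Item 1: proposed reporting interval of 3 seconds instead of 1. */
    static final long REPORT_INTERVAL_MS = 3000;

    private final Umbilical umbilical;
    private final AtomicBoolean progressPending = new AtomicBoolean(false);
    private volatile float progress;

    ProgressReporterSketch(Umbilical umbilical) {
        this.umbilical = umbilical;
    }

    /** Called by the task whenever it makes progress. */
    public void setProgress(float p) {
        progress = p;
        progressPending.set(true);
    }

    public boolean hasPendingProgress() {
        return progressPending.get();
    }

    /**
     * One iteration of the reporting loop: send a status update if progress
     * is pending, otherwise a ping. Returns true if the call succeeded.
     */
    public boolean reportOnce() {
        boolean sendProgress = progressPending.getAndSet(false);
        try {
            if (sendProgress) {
                umbilical.statusUpdate(progress);
            } else {
                umbilical.ping();
            }
            return true;
        } catch (IOException e) {
            // Item 2: a failed progress report stays pending, so the next
            // iteration retries it instead of degrading to a ping.
            if (sendProgress) {
                progressPending.set(true);
            }
            return false;
        }
    }

    /**
     * Runs a fixed number of iterations. The sleep follows every attempt,
     * so a failed call is never retried in a tight loop (item 3).
     */
    public void run(int iterations, long intervalMs) throws InterruptedException {
        for (int i = 0; i < iterations; i++) {
            reportOnce();
            Thread.sleep(intervalMs);
        }
    }
}
```

A caller would construct the reporter around its RPC proxy, call `setProgress` from the task, and drive `run(n, REPORT_INTERVAL_MS)` from a background thread. Bounding the loop by an iteration count is a simplification for the sketch; a real reporter would run until the task completes.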

Attachments

1651.patch (5 kB), 25/Jul/07 12:04, Devaraj Das
1651.3.patch (4 kB), 27/Jul/07 06:55, Devaraj Das
1651.2.patch (3 kB), 26/Jul/07 18:07, Devaraj Das
1651.1.patch (3 kB), 25/Jul/07 18:47, Devaraj Das

Issue Links

relates to: HADOOP-1586 Progress reporting thread can afford to be slightly lenient towards exceptions other than ConnectException (Closed)

People

Assignee: Devaraj Das

Reporter: Devaraj Das

Votes: 0

Watchers: 0

Dates

Created: 25/Jul/07 05:58

Updated: 08/Jul/09 16:52

Resolved: 07/Aug/07 20:43