Hadoop Common / HADOOP-35

Files missing chunks can cause mapred runs to get stuck


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.1.0
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels: None
    • Environment: ~20 datanode DFS cluster

    Description

      I've now run into a problem several times where a large run gets stalled as a result of a missing data block. The latest was a stall in the Summer, i.e. the data might have all been there, but it was impossible to proceed because the CRC file was missing a block. It would be nice to:

      1) Have a "health check" running on a MapReduce job. If any data isn't available, emit periodic warnings, and maybe have a timeout for the case where the data never comes back. Such warnings should specify which file(s) are affected by the missing blocks.
      2) Have a utility, possibly part of the existing dfs utility, which can check for DFS files with unlocatable blocks. It could even show the 'health' of a file, i.e. what percentage of its blocks are currently at the desired replication level. Currently, there's no way that I know of to find out whether a file in DFS is going to be unreadable. A rough sketch of what such a check could look like follows this list.
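
      The attached dfsshell.health.patch.txt presumably implements this against the 0.1.0-era DFSShell internals; that code is not reproduced here. Purely as an illustration of the idea, below is a minimal sketch written against the modern org.apache.hadoop.fs.FileSystem API rather than the 0.1.0 code base. The class name FileHealth and the output format are invented for this example; getFileStatus, getFileBlockLocations, and BlockLocation.getHosts are the only real API calls relied on.

          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.BlockLocation;
          import org.apache.hadoop.fs.FileStatus;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;

          // Sketch only: for one file, report which blocks are unlocatable and
          // what fraction of blocks sit at the desired replication level.
          public class FileHealth {
            public static void main(String[] args) throws Exception {
              FileSystem fs = FileSystem.get(new Configuration());
              Path path = new Path(args[0]);
              FileStatus status = fs.getFileStatus(path);
              BlockLocation[] blocks =
                  fs.getFileBlockLocations(status, 0, status.getLen());
              int atReplication = 0;
              for (BlockLocation block : blocks) {
                int hosts = block.getHosts().length;
                if (hosts == 0) {
                  // No datanode holds this block: the file is currently unreadable.
                  System.out.println("Missing block at offset " + block.getOffset());
                }
                if (hosts >= status.getReplication()) {
                  atReplication++;
                }
              }
              double pct = blocks.length == 0
                  ? 100.0 : 100.0 * atReplication / blocks.length;
              System.out.printf("%s: %d/%d blocks at desired replication (%.0f%%)%n",
                  path, atReplication, blocks.length, pct);
            }
          }

      A block reporting zero hosts is the unreadable case described above; counting hosts against the file's target replication gives the percentage-style health figure suggested in point 2.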

      Attachments

        1. dfsshell.health.patch.txt (3 kB), Bryan Pendleton


    People

      Assignee: Unassigned
      Reporter: Bryan Pendleton (bpendleton)
      Votes: 0
      Watchers: 0
