[HDFS-1104] Fsck triggers full GC on NameNode - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.21.0
Fix Version/s: 0.21.0
Component/s: namenode
Labels: None
Hadoop Flags: Reviewed

Description

A NameNode in one of our clusters went into full GC while fsck was running. Investigation shows that the cause is how the NameNode handles a file's access time.

Fsck calls open on every file in the checked directory to get the file's block locations. Each open changes the file's access time, which in turn writes a transaction entry to the edit log. The current code optimizes open so that it returns without syncing the edit log to disk. It happened that no other jobs were running in our cluster while fsck was performed, so no edit log sync was ever called and all the open transactions were kept in memory. When the edit log buffer became full, it automatically doubled its space by allocating a new buffer. Full GC happened when no contiguous space was found for the new, bigger buffer.
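The buffer growth described above can be illustrated with a minimal sketch. This is not the NameNode's edit log code: the class `EditLogBufferSketch` and its methods `logEdit` and `logSync` are hypothetical names chosen for illustration, and the 32-byte transaction size and 1 KB initial capacity are arbitrary assumptions. It only models a buffer that doubles when full and is emptied by a sync.

```java
import java.util.Arrays;

/** Hypothetical model of an in-memory edit buffer; NOT the actual HDFS classes. */
public class EditLogBufferSketch {
    private byte[] buf;
    private int count = 0;          // bytes written but not yet synced
    private int reallocations = 0;  // how many times the buffer was doubled

    public EditLogBufferSketch(int initialCapacity) {
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("initialCapacity must be positive");
        }
        buf = new byte[initialCapacity];
    }

    /** Append one transaction; double the backing array until it fits. */
    public void logEdit(byte[] txn) {
        while (count + txn.length > buf.length) {
            // Each doubling needs one new contiguous array of twice the size.
            buf = Arrays.copyOf(buf, buf.length * 2);
            reallocations++;
        }
        System.arraycopy(txn, 0, buf, count, txn.length);
        count += txn.length;
    }

    /** Stand-in for a sync: a real log would write the bytes to disk first. */
    public void logSync() {
        count = 0;
    }

    public int capacity() { return buf.length; }
    public int pending() { return count; }
    public int reallocations() { return reallocations; }

    public static void main(String[] args) {
        byte[] txn = new byte[32];  // assumed size of one access-time transaction
        EditLogBufferSketch noSync = new EditLogBufferSketch(1024);
        EditLogBufferSketch withSync = new EditLogBufferSketch(1024);
        for (int i = 0; i < 100_000; i++) {
            noSync.logEdit(txn);            // fsck-like load with no sync at all
            withSync.logEdit(txn);
            if (i % 16 == 15) {
                withSync.logSync();         // some other operation forces a sync
            }
        }
        System.out.println("no sync:   capacity=" + noSync.capacity()
                + " reallocations=" + noSync.reallocations());
        System.out.println("with sync: capacity=" + withSync.capacity()
                + " reallocations=" + withSync.reallocations());
    }
}
```

In this model the unsynced buffer keeps doubling as long as opens arrive, while the periodically synced buffer never outgrows its initial allocation. That matches the report's observation that the problem appeared only because nothing else in the cluster triggered a sync.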

Attachments

- fsckATime_Yahoo0.20.patch, 03/May/10 23:39, 6 kB, Hairong Kuang
- fsckATime.patch, 27/Apr/10 22:50, 3 kB, Hairong Kuang
- fsckATime1.patch, 28/Apr/10 23:08, 4 kB, Hairong Kuang
- fsckATime2.patch, 29/Apr/10 19:12, 5 kB, Hairong Kuang

People

Assignee: Hairong Kuang

Reporter: Hairong Kuang

Votes: 0

Watchers: 9

Dates

Created: 21/Apr/10 22:12

Updated: 24/Aug/10 20:52

Resolved: 30/Apr/10 22:07