HBASE-12626: Archive cleaner cannot keep up; it maxes out at about 400k deletes/hour


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.94.25
    • Fix Version/s: None
    • Component/s: master, scaling
    • Labels: None

    Description

      On big clusters, it is possible to overrun the archive cleaning thread. Make it able to do more work per cycle when needed.

      We saw this on a user's cluster. The rate at which files were being moved to the archive exceeded our delete rate, so the archive accumulated tens of millions of files, putting friction on all cluster operations.

      The cluster had ~500 nodes and was RAM constrained (other processes on the boxes also needed RAM). Over a period of days, the loading was thrown off kilter because the cluster started taking double writes while moving from one schema to another (it was already running hot before the double loading). The master was deleting an archived file every 9ms on average, about 400k deletes an hour (one delete per 9ms is roughly 111 deletes/second, or ~400k/hour). The constrained RAM and the tables' 4-5 column families had the cluster creating files faster than that rate, so the archive backed up.

      For some helpful background/input, see the dev thread http://search-hadoop.com/m/DHED4UYSF9
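
      One way to let the chore do more work per cycle would be to fan the deletes out over a small thread pool rather than removing archived files one at a time. Below is a minimal, hypothetical sketch of that idea; the class and method names are illustrative only, not the committed HBase change.

      // Hypothetical sketch only -- not the committed HBase change. It illustrates
      // letting one cleaner cycle delete archived files in parallel instead of
      // one at a time.
      import java.io.IOException;
      import java.util.List;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.TimeUnit;

      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.Path;

      public class ParallelArchiveDeleteSketch {

        /** Delete the given archived files using a small thread pool, then wait. */
        public static void deleteInParallel(FileSystem fs, List<Path> deletable, int threads)
            throws InterruptedException {
          ExecutorService pool = Executors.newFixedThreadPool(threads);
          for (Path p : deletable) {
            pool.submit(() -> {
              try {
                fs.delete(p, false); // non-recursive delete of a single archived file
              } catch (IOException e) {
                // Skip on failure; the next chore cycle can retry this file.
              }
            });
          }
          pool.shutdown();
          pool.awaitTermination(10, TimeUnit.MINUTES);
        }
      }

      At the ~111 single-threaded deletes/second observed above, even a handful of delete threads would raise the per-cycle ceiling well past the incoming file-creation rate, at the cost of more concurrent NameNode delete RPCs.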


            People

               Assignee: Unassigned
               Reporter: Michael Stack (stack)
               Votes: 0
               Watchers: 12
