[HBASE-26250] Automatic and near real-time healing of locality - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

I’m proposing a fairly major new tool for quickly and efficiently relieving the latency caused by poor data locality. This is especially useful in cloud environments, and it has been highly impactful at HubSpot, where we run thousands of RegionServers across 40+ multi-zone clusters. Please see the attached design doc for details on the problem, why compactions alone are not enough to solve it, and an overview (with diagram) of the components that make up this new tool.

As specified, this feature would require contributing a new tool to the HDFS project. Once we reach consensus on the approach, I can create the relevant upstream HDFS JIRA.

See the design doc here: https://docs.google.com/document/d/1GLGzrF1QLyhyOCr2fFw0LCymnyFPT0ktShTaaXn-75A/edit#heading=h.aswo7shg76b6

Note: This issue is an attempt to upstream a tool that has been fully deployed on all production clusters at HubSpot for about 6 months. It has been very effective for us as currently implemented, but it will need to be reorganized and redesigned a bit to fit into the HBase/HDFS projects. As such, I'd like feedback on the design before putting too much effort into porting multiple components into PRs.

Issue Links

is related to

HDFS-16261 Configurable grace period around invalidation of replaced blocks (Open)

HDFS-16155 Allow configurable exponential backoff in DFSInputStream refetchLocations (Open)

HDFS-16262 Async refresh of cached locations in DFSInputStream (Resolved)

links to

Design Doc

Sub-Tasks

Reflect out-of-band locality improvements in served requests (Resolved)

Bryan Beaudreault

People

Assignee: Bryan Beaudreault

Reporter: Bryan Beaudreault

Votes: 0

Watchers: 16

Dates

Created: 02/Sep/21 14:18

Updated: 06/Oct/22 07:22