[OAK-5056] Improve GC scalability on TarMK - ASF JIRA

Details

Type: Epic
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.8.0
Component/s: segment-tar
Labels:
- gc
- scalability

Epic Name: TarMK GC scalability

Description

This issue is about making TarMK GC more scalable:

- How to deal with huge repositories.
- How to deal with massive concurrent writes.
- How to improve monitoring to determine GC health.
  - Monitor deduplication caches (e.g. deduplication of checkpoints).

Possible avenues to explore:

- Can we partition GC? (e.g. along sub-trees, along volatile vs. static content)
- Can we pause and resume GC? (e.g. to give precedence to concurrent writes)
- Can we make GC a real background process that does not contend with foreground operations?

This issue is a follow-up to OAK-2849, which was about the efficacy of GC.

People

Assignee: Michael Dürig

Reporter: Michael Dürig

Votes: 0

Watchers: 5

Dates

Created: 03/Nov/16 13:40

Updated: 06/Feb/18 20:53

Resolved: 06/Feb/18 20:53