Details

    • Type: Umbrella
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Even with the improved G1 GC in Java 7, Java processes that want to address large regions of memory while also providing low high-percentile latencies continue to be challenged. Fundamentally, a Java server process that has high data throughput and also tight latency SLAs will be stymied by the fact that the JVM does not provide a fully concurrent collector. There is simply not enough throughput to copy data during GC under safepoint (all application threads suspended) within available time bounds. This is increasingly an issue for HBase users operating under dual pressures: 1. tight response SLAs, 2. the increasing amount of RAM available in "commodity" server configurations, because GC load is roughly proportional to heap size.

      We can address this using parallel strategies. We should talk with the Java platform developer community about the possibility of a fully concurrent collector appearing in OpenJDK somehow. Setting aside the question of whether this is too little too late, if one becomes available the benefit will be immediate, though subject to qualification for production, and transparent in terms of code changes. However, in the meantime we need an answer for Java versions already in production. This requires we move the large arena allocations off heap, those being the blockcache and memstore. On other JIRAs recently there has been related discussion about combining the blockcache and memstore (HBASE-9399) and about flushing memstore into blockcache (HBASE-5311), which is related work.

      We should build off-heap allocation for memstore and blockcache, perhaps a unified pool for both, and plumb through zero-copy direct access to these allocations (via direct buffers) through the read and write I/O paths. This may require the construction of classes that provide object views over data contained within direct buffers. This is something else we could talk with the Java platform developer community about: it could be possible to provide language-level object views over off-heap memory, where on-heap objects could hold references to objects backed by off-heap memory but not vice versa, maybe facilitated by new intrinsics in Unsafe. Again, we need an answer for today also. We should investigate what existing libraries may be available in this regard.

      Key will be avoiding marshalling/unmarshalling costs. At most we should be copying primitives out of the direct buffers to register or stack locations until finally copying data to construct protobuf Messages. A related issue there is HBASE-9794, which proposes scatter-gather access to KeyValues when constructing RPC messages. We should see how far we can get with that, and also with zero-copy construction of protobuf Messages backed by direct buffer allocations. Some amount of native code may be required.
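      The "object views over data contained within direct buffers" idea can be sketched as a flyweight: a reusable on-heap view object that reads primitives in place from a direct ByteBuffer, copying bytes out only when an on-heap array is unavoidable. The class name and record layout below are hypothetical, for illustration only:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: a flyweight "object view" over a record stored in a
// direct buffer. Field accessors read primitives in place; no per-record
// on-heap object holds the data itself.
final class CellView {
    // Assumed layout: [keyLen:int][valueLen:int][key bytes][value bytes]
    private ByteBuffer buf;   // shared direct buffer
    private int offset;       // start of this cell within the buffer

    CellView wrap(ByteBuffer buf, int offset) {
        this.buf = buf;
        this.offset = offset;
        return this;          // one view can be re-pointed at many cells
    }

    int keyLength()   { return buf.getInt(offset); }      // absolute reads,
    int valueLength() { return buf.getInt(offset + 4); }  // no copying

    // Copy only the key bytes out when an on-heap array is unavoidable.
    byte[] copyKey() {
        byte[] key = new byte[keyLength()];
        ByteBuffer dup = buf.duplicate();  // independent position/limit
        dup.position(offset + 8);
        dup.get(key);
        return key;
    }
}
```

One view instance can be rewound over an entire cellblock, so the number of on-heap objects stays constant regardless of how many cells the buffer holds.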

        Issue Links

          Activity

          Nick Dimiduk added a comment -

          Apparently you're reading my mind

          Nicely articulated Andrew Purtell. I'd like to see a body of evidence that points to specific components which make meaningful sense for moving off-heap. Memstore and BlockCache are commonly cited as the offending components, but I've not seen anyone present conclusive profiling results making this clear. Nor is there clear advice regarding at what point a heap becomes too large. I've started work to track down some real data here on both of these points before pressing forward with recommendations.

          See also Nicolas Liochon's recent profiling work reducing the GC burden imposed by the protobuf RPC implementation. This is an example where a major offender isn't on the above short-list. I am excited to work toward and experiment with an entirely off-heap data flow, at least for the read path (HDFS -> BlockCache -> RPC send buffer)!

          Andrew Purtell added a comment - - edited

          Memstore and BlockCache are commonly cited as the offending components, but I've not seen anyone present conclusive profiling results making this clear

          It's abundantly clear once using heaps larger than ~8 GB that collection pauses under safepoint blow out latency SLAs at the high percentiles. Why would we need heaps larger than this? To take direct advantage of large server RAM. Memstore and blockcache are then the largest allocators of heap memory. If we move them off heap, they can "soak up" most of the available RAM, leaving remaining heap demand relatively small - this is the idea.

          Edit: Phrasing

          Vladimir Rodionov added a comment -

          This will require redesigning the whole data flow in HBase. Currently, the minimum (and maximum) data exchange element in HBase's internal pipeline is KeyValue, which is a heavy, on-heap (byte-array-backed) data structure. Moving data allocations off heap is half the problem; the other half is avoiding copy-on-read and copy-on-write (from/to off heap). Serialization is quite expensive.

          Vladimir Rodionov added a comment -

          It's abundantly clear once using heaps larger than ~8 GB that collection pauses under safepoint blow out latency SLAs at the high percentiles.

          What HBase version are you using? No bucket cache yet?

          Andrew Purtell added a comment -

          What HBase version are you using? No bucket cache yet?

          Trunk, what is now 0.98.

          As you point out above, serialization/deserialization costs limit the bucket cache, which is why I propose the goal of direct operation on allocations backed by off-heap memory. This has to be approached in stages.

          The bucket cache encourages looking at this approach. Although you'll see reduced throughput, it will smooth out the latency tail and allow the blockcache to address RAM without increasing heap size, which also helps smooth out the latency tail with respect to collection pause distribution. However, with large heaps, e.g. 128+ GB, mixed-generation collections exceeding the ZooKeeper heartbeat timeout are inevitable under mixed read+write load; nothing I have found mitigates that sufficiently.

          Vladimir Rodionov added a comment -

          The bucket cache encourages looking at this approach. Although you'll see reduced throughput, it will smooth out the latency tail and allow the blockcache to address RAM without increasing heap size, which also helps smooth out the latency tail with respect to collection pause distribution. However, with large heaps, e.g. 128+ GB, mixed-generation collections exceeding the ZooKeeper heartbeat timeout are inevitable under mixed read+write load; nothing I have found mitigates that sufficiently.

          It looks like you have done some bucket cache research and tests. Are there any numbers available? We are considering upgrading to the 0.96 release, and the bucket cache is the major attraction for us. According to you, it's not that usable, or it does not give any performance advantage? I really doubt that an 80 GB on-heap block cache is a viable alternative to an off-heap cache in a mixed read/write load scenario, even in Java 7 with G1.

          One thing to note: having a serialization barrier has one huge advantage over direct off-heap access. You can compress blocks off heap. For our application the compression ratio is close to 4.

          Andrew Purtell added a comment -

          It is on my to do list to produce a technical report, but my time is quite constrained and that item is not close to the top of the list. As always, you should evaluate HBase using your application and environment. You may be quite happy with 0.96, with or without the bucket cache.

          having a serialization barrier has one huge advantage over direct off-heap access. You can compress blocks off heap

          That's a great point. I would actually like to operate on an encoded block representation from disk to socket. This is a trick in-memory databases have been using for years, and it will let us push through the memory wall, but that is several steps down a long road. The scope of this JIRA is described in the 'Description' field above.

          stack added a comment -

          Andrew Purtell There are a couple of off-heap experiments ongoing. This JIRA covers memstore and blockcache allocations. Seems like we need a larger umbrella issue than this allows? If you agree I'll open one, because it would be useful to be able to tie all efforts together. Good on you Andrew Purtell

          Andrew Purtell added a comment -

          If you want to reparent this somewhere that's fine with me stack. We're going to start with memstore and blockcache (likely a unified pool) and go from there based on results. If there are other things going on, it would be good to put them all together so we can try to coordinate.

          Matt Corgan added a comment -

          Something to keep in mind is that GC pauses can be influenced as much or more by the number of live objects as by the raw size of the heap. 32 GB of block cache could be made up of only ~1 million 32 KB blocks. That particular 32 GB of memory may not stop the world for very long. It's all the small remaining objects that keep the garbage collector busy, and I bet the biggest culprit here is the individual KeyValues in the memstores.

          MemstoreLAB combines the backing arrays into big chunks to reduce heap fragmentation, but there is still one object per KeyValue, and each object needs to be considered by the collector. A big heap has big memstores, which have lots of KeyValues - possibly far more than the ~1 million blocks in the block cache. A big advantage of flattening the memstores into blocks of key values is that you might be reducing ~500 KeyValues to a single block object. This 500x reduction in objects strikes me as a significant GC pause improvement that is independent from off-heap techniques.
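          The flattening argument can be illustrated with a toy sketch (hypothetical names, not HBase code): N cells packed into one backing buffer present the collector with a single live object instead of N, even though the same bytes are stored:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: flatten many cells into one backing block so the
// collector traces one object instead of one object per cell.
final class FlatCellBlock {
    private final ByteBuffer block;   // single allocation for all cells
    private final int cellCount;

    FlatCellBlock(List<byte[]> cells) {
        int total = 0;
        for (byte[] c : cells) total += 4 + c.length;   // [len:int][bytes]
        ByteBuffer b = ByteBuffer.allocate(total);      // one object, N cells
        for (byte[] c : cells) b.putInt(c.length).put(c);
        b.flip();
        this.block = b;
        this.cellCount = cells.size();
    }

    int cellCount() { return cellCount; }

    // Iterate cells in place; copies happen only when handing bytes out.
    List<byte[]> scan() {
        List<byte[]> out = new ArrayList<>(cellCount);
        ByteBuffer b = block.duplicate();
        while (b.hasRemaining()) {
            byte[] cell = new byte[b.getInt()];
            b.get(cell);
            out.add(cell);
        }
        return out;
    }
}
```

Here 500 cells cost the collector one traced object plus whatever the scan copies out, versus 500 traced objects for a KeyValue-per-object layout.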


          Moving blocks off-heap and operating on them directly will be very cool. DataBlockEncoders should be able to read off-heap blocks similarly to how they do now, namely, copying only the modified bytes from the previous cell into an array buffer. Vladimir makes a good point that it would be tough to match the scan performance of unencoded data, so that would need some thinking.

          stack added a comment -

          If we supplied DFSClient our own DBB, then maybe we could read from dfs and put into an offheap blockcache w/o going over the heap (see HDFS-2834 "ByteBuffer-based read API for DFSInputStream")

          Andrew Purtell added a comment - - edited

          I'm looking at Netty 4's netty-buffer module (http://netty.io/4.0/api/io/netty/buffer/package-summary.html), which has some nice properties, including composite buffers, arena allocation, dynamic buffer resizing, and reference counting, not to mention development and testing by another community. I also like it because you can plug in your own allocators and specialize the abstract ByteBuf base type. More on this later.

          When I get closer to seeing what exactly needs to be done I will post a design doc. Current thinking follows. Below the term 'buffer' currently means Netty ByteBufs or derived classes backed by off-heap allocated direct buffers.

          Write

          When coming in from RPC, cells are laid out by codecs into cellblocks in buffers, and the cellblocks/buffers are handed to the memstore. Netty's allocation arenas replace the MemstoreLAB. The memstore data structure evolves into an index over cellblocks.

          Per Matt Corgan's comment above, we should think about how the memstore index can be built with fewer object allocations than the number of cells in the memstore, yet be in the ballpark with efficiency of concurrent access. A tall order. CSLM wouldn't be the right choice as it allocates at least one list entry per key, but we could punt and use it initially and make a replacement datastructure as a follow on task.
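          A minimal sketch of that punt: a CSLM index mapping on-heap keys (one entry per cell, the acknowledged cost) to coordinates into off-heap cellblocks. The Coord holder is hypothetical; the unsigned key ordering uses the JDK's Arrays.compareUnsigned:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch of the "punt" design: a CSLM index over off-heap cellblocks.
// Keys live on heap; values are just (block, offset) coordinates into
// direct buffers. Names are illustrative.
final class CellblockIndex {
    static final class Coord {
        final ByteBuffer block;   // which cellblock (a direct buffer)
        final int offset;         // where the cell starts within it
        Coord(ByteBuffer block, int offset) {
            this.block = block;
            this.offset = offset;
        }
    }

    // Unsigned lexicographic byte comparison, as HBase key ordering requires.
    private final ConcurrentSkipListMap<byte[], Coord> index =
        new ConcurrentSkipListMap<>(Arrays::compareUnsigned);

    void put(byte[] key, ByteBuffer block, int offset) {
        index.put(key, new Coord(block, offset));
    }

    Coord floor(byte[] key) {   // seek-to-or-before, for scanner positioning
        java.util.Map.Entry<byte[], Coord> e = index.floorEntry(key);
        return e == null ? null : e.getValue();
    }

    int size() { return index.size(); }
}
```

The follow-on task would be replacing this with a structure that allocates fewer than one entry per cell while keeping comparable concurrent-access efficiency.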

          Cellblocks in memstore should be amenable to flushing to disk as a gathering write. This may mean cellblocks have the same internal structure as HFile blocks and we reuse all of the block encoder machinery (and simplify them in the process).

          Read

          We feed down buffers to HDFS to fill with file block data. We pick which pool to get a buffer from for a read depending on family caching strategy. Pools could be backed by arenas that match up with LRU policy strata, with a common pool/arena for noncaching reads. (Or for noncaching reads, can we optionally use a new API for getting buffers up from HDFS, perhaps backed by the pinned shared RAM cache, since we know we will be referring to the contents only briefly?) It will be important to get reference counting right as we will be servicing scans while attempting to evict. Related, eviction of a block may not immediately return a buffer to a pool, if there is more than one block in a buffer.
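          The reference-counting requirement can be sketched as follows. The retain/release shape mirrors Netty-style counting, but the class and pool here are illustrative, not a real HBase or Netty API: an evicted block's buffer must not return to the pool while a scanner still holds a reference.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a reference-counted direct buffer that returns itself
// to a pool only when the last holder (cache or scanner) releases it.
final class RefCountedBuffer {
    final ByteBuffer buf;
    private final AtomicInteger refCnt = new AtomicInteger(1);
    private final BlockingQueue<ByteBuffer> pool;

    RefCountedBuffer(ByteBuffer buf, BlockingQueue<ByteBuffer> pool) {
        this.buf = buf;
        this.pool = pool;
    }

    RefCountedBuffer retain() {   // a scanner takes a reference
        if (refCnt.getAndIncrement() <= 0) {
            throw new IllegalStateException("already released");
        }
        return this;
    }

    // Returns true when this release actually recycled the buffer.
    boolean release() {
        if (refCnt.decrementAndGet() == 0) {
            pool.offer(buf);      // only now is the memory reusable
            return true;
        }
        return false;
    }

    int refCnt() { return refCnt.get(); }
}
```

Eviction then just calls release(); if a scan is in flight the buffer survives until the scanner's own release, which is exactly the ordering the eviction-while-scanning case needs.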

          We maintain new metrics on numbers of buffers allocated, stats on arenas, stats on wastage and internal fragmentation of the buffers, etc, and use these to guide optimizations and refinements.

          This should require fewer changes than the write side since we are already set up for dealing with cellblocks. Design points to optimize: minimizing the number and size of data copies, minimizing the number of on-heap object allocations, and an on-disk encoding suitable as an efficient in-memory representation.

          Lars Hofhansl added a comment - - edited

          My office neighbor used to work on a proprietary Java database, and he says they used 128GB or even 192GB Java heaps and larger all the time without any significant GC impact.

          (non moving) Collection times are not a function of the heap size but rather of heap complexity, i.e. the number of objects to track (HBase also produces a lot of garbage, but that is short lived and can be quickly collected by a moving collector for the young gen).
          With MemstoreLAB and the block cache, HBase already does a good job on this. Even as things stand currently, if we fill an entire 128 GB heap with 64k blocks from the blockcache, that would only be about 2m objects.
          Now, if we want latencies in the < 100 ms area, we need to rethink things; that will generally be very difficult in current Java.
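          The 2m-objects estimate checks out arithmetically:

```java
// Quick check of the estimate: a 128 GB heap filled entirely with
// 64 KB blocks holds about two million blocks (traced objects).
class BlockCountEstimate {
    public static void main(String[] args) {
        long heapBytes = 128L * 1024 * 1024 * 1024;   // 128 GiB
        long blockBytes = 64L * 1024;                 // 64 KiB per block
        long blocks = heapBytes / blockBytes;
        System.out.println(blocks);                   // 2097152, i.e. ~2m
    }
}
```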

          Rather than moving everything out of the Java heap all-or-nothing, we should also investigate whether we can make the GC's life easier.

          Edit: Edited for clarity.

          Andrew Purtell added a comment -

          I intend to prototype something so we don't have to argue supposition.

          Yes, enabling sub-100 ms collections at the 95th or 99th percentile is an important consideration. We also want to consider addressing up to 1 TB of usable memory without loading up cores with redundant work / multiple processes.

          Some GC overheads are a linear function of the heap size, at least for G1.

          Lars Hofhansl added a comment -

          Yeah, was talking about CMS and definitely less than 1TB.

          Please do not read my comment as criticism, this is very important work.
          No doubt you can drive max latency down significantly by going off heap; at the same time there are probably a lot of further improvements we can make to current HBase in the heap allocation area.

          Vladimir Rodionov added a comment -

          We also want to consider addressing up to 1 TB of usable memory without loading up cores with redundant work / multiple processes.

          6TB of RAM.
          http://www.supermicro.nl/newsroom/pressreleases/2014/press140218_4U_4-Way.cfm

          Collection times are not a function of the heap size but rather of heap complexity, i.e. the number of objects to track

          Heap compaction is a function of heap size (at least in CMS).

          Lars Hofhansl added a comment -

          Heap compaction is a function of a heap size (at least in CMS).

          Not to start a long, tangential argument here... Last I looked CMS was non-compacting, and thus the only relevant metric is the number of objects to trace, not their size. A 100G heap with 10000 objects is far easier to manage than a 100G heap with 100 million objects.

          Vladimir Rodionov added a comment -

          Right, CMS is not compacting, but nevertheless compaction happens from time to time (full GC), and that is a function of heap size.

          Lars Hofhansl added a comment -

          (Not necessarily: if all objects are of roughly the same size then you will never need a full GC.)

          In any case, nobody is arguing (at least I am not) that 1T or more (6T? Wow) should be managed off-heap with contemporary Hotspot JVMs. I'm looking forward to what Andrew and folks will produce here.

          Matt Corgan added a comment -

          I hate to continue the tangent, but I'd add that even the occasional compaction that CMS triggers depends on how many objects need to be compacted. "Random" access memory isn't as random anymore: there are enormous speed boosts when copying long swaths of sequential memory, so compacting 100 1 GB slabs should be far faster than compacting 1 billion 100 B KeyValues that are scattered around the heap. I also wonder if there's a slab size big enough that hotspot won't bother moving it during a compaction (but I have no idea).

          Separately, one of the reasons Nick and I thought ByteRange should be an interface was that we could back it with varying implementations including arrays, HeapByteBuffers, DirectByteBuffers, netty ByteBufs, etc. A utility similar to IOUtils.copy could help optimizing the copies between the different implementations. Another advantage of using it as the primary interface is that its internal compareTo method uses hbase-friendly unsigned byte comparison, making it easy to put ByteRanges into traditional sorted collections like TreeSet/CSLM without passing an external comparator.

          I could see using an allocator based on huge on or off-heap slabs where smaller pages/blocks are referenced by reusable ByteRanges. The allocator could recycle memory by continuously picking the least utilized slab and copying (moving) its occupied ByteRanges to the slab at the head of the queue. This would provide constant compaction via fast sequential copying.
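          The ByteRange-as-interface idea might look roughly like this; the names and methods below are illustrative, not HBase's actual ByteRange API. One abstraction covers arrays and (direct) ByteBuffers, with HBase-friendly unsigned comparison built in so instances drop straight into TreeSet/CSLM without an external comparator:

```java
import java.nio.ByteBuffer;

// Hedged sketch of ByteRange as an interface with pluggable backings.
interface ByteRange extends Comparable<ByteRange> {
    int length();
    byte get(int index);

    // Unsigned lexicographic comparison, the ordering HBase keys need.
    default int compareTo(ByteRange other) {
        int n = Math.min(length(), other.length());
        for (int i = 0; i < n; i++) {
            int d = (get(i) & 0xFF) - (other.get(i) & 0xFF);
            if (d != 0) return d;
        }
        return length() - other.length();
    }
}

// Backed by a plain byte array (possibly a slice of a larger slab).
final class ArrayByteRange implements ByteRange {
    private final byte[] bytes;
    private final int offset, length;
    ArrayByteRange(byte[] bytes, int offset, int length) {
        this.bytes = bytes; this.offset = offset; this.length = length;
    }
    public int length() { return length; }
    public byte get(int i) { return bytes[offset + i]; }
}

// Backed by a ByteBuffer, which may be a DirectByteBuffer off heap.
final class BufferByteRange implements ByteRange {
    private final ByteBuffer buf;
    BufferByteRange(ByteBuffer buf) { this.buf = buf; }
    public int length() { return buf.remaining(); }
    public byte get(int i) { return buf.get(buf.position() + i); }
}
```

A copy utility in the spirit of IOUtils.copy could then specialize on the concrete backing pair (array-to-array, buffer-to-buffer, mixed) to pick the fastest bulk transfer.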

          stack added a comment -

          (Good discussion going on here)

          How then to have KeyValues/Cells w/o calling them out as individual objects? Keep cellblocks of KeyValues/Cells w/ a CellScanner to read over 64k blocks of them? For MemStore, once we hit some upper bound – say 64k, 1M? – 'flush' it to an inmemory, sorted, cellblock? Reading, we'd consult the (small) CSLM memstore and some tiering of cellblocks?

          Lars Hofhansl added a comment -

          HBASE-5311 and HBASE-9440 have related discussion. If we're smart we can build all these things such that they work both on- and off-heap.

          Andrew Purtell added a comment -

          (Matt Corgan) I could see using an allocator based on huge on or off-heap slabs where smaller pages/blocks are referenced by reusable ByteRanges. The allocator could recycle memory by continuously picking the least utilized slab and copying (moving) its occupied ByteRanges to the slab at the head of the queue. This would provide constant compaction via fast sequential copying.

          We could make the investment of writing our own slab allocator. Experiments with Netty 4 ByteBufs are in part about seeing if we can re-use open source in production already rather than redo the work. On the other hand, it could be a crucial component so maybe it's necessary to have complete control. Perhaps we can move additional comments on this sub-topic over to HBASE-10573?

          Matt Corgan added a comment -

          How then to have KeyValues/Cells w/o calling them out as individual objects? .... For MemStore, once we hit some upper bound – say 64k, 1M? – 'flush' it to an inmemory, sorted, cellblock? Reading, we'd consult the (small) CSLM memstore and some tiering of cellblocks?

          I think there's been talk of this before, and it makes sense to me. It's basically creating small in-memory HFiles that can be compacted several times in memory without going to disk, and holding on to the WAL entries until they do go to disk. We'd get huge space savings from the reduction in objects, references, and repetition via block encoding. The problem is that if you have hundreds of 1MB in-memory HFiles, then it becomes too expensive to merge them all (via KVHeap) when scanning. A possible solution is to subdivide the memstore into "stripes" (probably smaller than the stripe compaction stripes) and periodically compact the in-memory stripes. It sounds complicated compared to the current memstore, but it's probably simpler than other parts of HBase because you don't have to deal with IOExceptions, retries, etc.
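          As a rough sketch of the "small in-memory HFiles" idea (hypothetical names; String keys and values stand in for KeyValues, and a plain sorted map stands in for an encoded cellblock): a small active CSLM is periodically snapshotted into immutable sorted segments, reads merge the CSLM with the segments newest-first, and an in-memory compaction bounds the merge fan-in:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Illustrative sketch only, not HBase code; names are hypothetical.
public class CellBlockMemstoreSketch {

    private ConcurrentSkipListMap<String, String> active = new ConcurrentSkipListMap<>();
    private final List<NavigableMap<String, String>> segments = new ArrayList<>();
    private final int flushThreshold; // entry count standing in for a byte size

    public CellBlockMemstoreSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    public void put(String key, String value) {
        active.put(key, value);
        if (active.size() >= flushThreshold) flushInMemory();
    }

    // "Flush to memory": snapshot the small CSLM into an immutable sorted
    // segment; a real implementation would encode it into an off-heap block.
    private void flushInMemory() {
        segments.add(Collections.unmodifiableNavigableMap(new TreeMap<>(active)));
        active = new ConcurrentSkipListMap<>();
    }

    // Read path: consult the small CSLM first, then segments newest-to-oldest
    // (the per-read merge whose fan-in grows with the number of segments).
    public String get(String key) {
        String v = active.get(key);
        if (v != null) return v;
        for (int i = segments.size() - 1; i >= 0; i--) {
            v = segments.get(i).get(key);
            if (v != null) return v;
        }
        return null;
    }

    // In-memory compaction: merge all segments into one (oldest first, so
    // newer values win), bounding the read-time merge cost.
    public void compact() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (NavigableMap<String, String> seg : segments) merged.putAll(seg);
        segments.clear();
        segments.add(Collections.unmodifiableNavigableMap(merged));
    }

    public int segmentCount() { return segments.size(); }
}
```

          The striping idea then amounts to running one of these per key range so each compaction touches only its own stripe's segments.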

          Andrew Purtell added a comment - - edited

          The problem is that if you have hundreds of 1MB in-memory HFiles, then it becomes too expensive to merge them all (via KVHeap) when scanning. A possible solution is to subdivide the memstore into "stripes" (probably smaller than the stripe compaction stripes) and periodically compact the in-memory stripes

          Anoop, Ram, and I were throwing around ideas of making in-memory HFiles out of memstore snapshots, and then doing in-memory compaction over them. If we have off-heap backing for memstore we could potentially carry larger snapshots (in memory HFiles resulting from a few merged memstore snapshots) leading to less frequent flushes and significantly less write amplification overall.

          stack added a comment -

          Matt Corgan

          It's basically creating small in-memory HFiles that can be compacted several times in memory without going to disk, and holding on to the WAL entries until they do go to disk.

          Pardon the dumb questions: "creating small in-memory HFiles..." – from a small CSLM that does the sort for us? Or, I remember talking to Martin Thompson once, trying to ask how he'd go about the MemStore 'problem', and I'm sure he didn't follow what I was on about (I was doing a crappy job explaining, I'm sure), but other than his usual adage of try everything and measure, he suggested just trying a sort on the fly... Are you thinking the same, Matt? So we'd keep around Cells and then, once we had a batch or after some nanos had elapsed, we'd do a merge sort w/ the current set of in-memory edits, put in place the new sorted 'in-memory-hfile', and up the mvcc read point so it was readable? Once they got to a certain size we'd do like we do now with snapshot and start up a new foreground set of edits to merge into?

          ...and holding on to the WAL entries until they do go to disk

          What are you thinking here? It would be good if the WAL system was not related to the MemStore system (though chatting w/ Liyin Tang recently, he had an idea that would make the WAL sync more 'live' if WAL sync updated mvcc (mvcc and seqid being tied)).

          Anoop, Ram, and I were throwing around ideas of making in-memory HFiles out of memstore snapshots....

          Would be sweet if the value at least was not on heap.... Sounds like nice experiment Andrew.

          Matt Corgan added a comment -

          "creating small in-memory HFiles..." – from a small CSLM that does the sort for us?

          Yes, that is all I meant. The CSLM would remain small because it gets flushed more often. I don't doubt there are better ways to do it than the CSLM (like the deferred sorting you mention), but even just shrinking the size of the CSLM would be an improvement without having to re-think the memstore's concurrency mechanisms.

          Let's say you have a 500MB memstore limit, and that encodes (not compresses) to 100MB. You could:

          • split it into 10 stripes, each with a ~50MB limit, and flush each of the 10 stripes (to memory) individually
            • you probably have a performance boost already because 10 50MB CSLMs are better than 1 500MB CSLM
          • for a given stripe, flush the CSLM each time it reaches 25MB, which will spit out a 5MB encoded "memory hfile" to the off-heap storage
          • optionally compact a stripe's "memory hfiles" in the background to increase read performance
          • when a stripe has a 25MB CSLM + 5 encoded snapshots, flush/compact the whole thing to disk
          • "release" the WAL entries for the stripe
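          For what it's worth, the arithmetic of the scheme above checks out — with the 5:1 encoding ratio given, a stripe sits at roughly 50MB (a 25MB CSLM plus five 5MB encoded snapshots) when it flushes to disk. A quick check of those example numbers (all sizes in MB; names illustrative):

```java
// Arithmetic check of the striping example above; all names hypothetical.
public class StripeMathSketch {
    public static void main(String[] args) {
        int memstoreLimit = 500;                            // total memstore limit, MB
        int stripes = 10;
        int stripeLimit = memstoreLimit / stripes;          // ~50MB per stripe
        int cslmFlushAt = 25;                               // flush a stripe's CSLM at 25MB
        double encodeRatio = 100.0 / 500.0;                 // 500MB encodes to 100MB
        double encodedSnapshot = cslmFlushAt * encodeRatio; // ~5MB "memory hfile"
        int snapshotsBeforeDiskFlush = 5;
        double stripeAtDiskFlush = cslmFlushAt + snapshotsBeforeDiskFlush * encodedSnapshot;

        System.out.println("stripe limit: " + stripeLimit + "MB");                    // 50MB
        System.out.println("encoded snapshot: " + encodedSnapshot + "MB");            // 5.0MB
        System.out.println("stripe footprint at disk flush: " + stripeAtDiskFlush + "MB"); // 50.0MB
    }
}
```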

          On the WAL entries, I was just pointing out that you can no longer release the WAL entries when you flush the CSLM. You have to hold on to the WAL entries until you flush the "memory hfiles" to disk.

          ramkrishna.s.vasudevan added a comment - - edited

          Would be sweet if the value at least was not on heap

          Yes, this could be a nice one. I think the Cell usage should be in place before doing this, though.

          {Got added by mistake.}
          Yu Li added a comment -

          Hi Matt Corgan and stack,

          I see you had a discussion long ago in HBASE-3484 (here), but it seems there has been no further progress since then. And [~mcorban], I see you have a more detailed design in mind now, according to your comment above, so I'm wondering whether you have done any real work to implement this design? Or have any plan to?

          Actually I think the design you proposed is somewhat different from the topic of this JIRA or of HBASE-3484, since it's more like an in-memory flush to reduce memory fragmentation rather than a "move off heap". I'm wondering whether it would be better to open another JIRA to make that discussion more explicit, while leaving the "off heap" discussion here?

          I've been watching this thread, or rather this topic, for a while, and now we've decided to make a similar improvement to our online HBase service, so I'd really like to work together with the community to complete the design and implementation of the "in-memory flush" work.

          I'm a totally new face in this discussion, so please kindly forgive me if I've said anything naive.

          Anoop Sam John added a comment -

          Yu Li, I am working on this CellBlocks work (yes, in-memory flushes). Coding-wise it is mostly done, and I will do perf tests as well. Some time ago I worked on HBASE-3484 but later dropped it. Here, along with off-heap, the discussion of CellBlocks also came up. This can greatly reduce the issue we face today with CSLM (when there are too many KVs in it). We are working on the off-heap stuff in parallel. My code is in a combined form now; let me separate it out. Also see HBASE-10648, which will allow us to have different MemStore impls.

          Matt Corgan added a comment -

          Yu Li you're right, flushing the memstore to memory is a separate issue from off-heap storage, but it's important to mention here so off-heap storage can be designed to support it. My comments about splitting the memstore into stripes could also be a separate issue, since it's just an improvement that saves you some in-memory compaction work on non-uniform data distributions.

          Yu Li added a comment -

          Hi Anoop Sam John,

          Thanks for the info, it's really good to know the progress; I had almost started to do the impl by myself. It's also great to see the patch making MemStore impls pluggable almost ready.

          My code is like in a combined form now. Let me seperate it out.

          I guess the code changes for CellBlocks would be based on HBASE-10648? I searched but found no separate JIRA for this CellBlocks impl; would you create one after separating the code out? I really cannot wait to take a look at it.

          Yu Li added a comment -

          Hi Matt Corgan,

          Got it, thanks for the explanation.

          Anoop Sam John added a comment -

          would you create one

          HBASE-10713. Will come up with a patch soon. Your suggestions are welcome. Please keep all such discussions under this new JIRA issue.

          Andrew Purtell added a comment -

          Just for documentary purposes at this point, since the implementation is early and has a long way to go: Red Hat recently announced ongoing work on a new GC called Shenandoah, with the stated goal to "Reduce GC pause times on extremely large heaps by doing evacuation work concurrently with Java threads and making pause times independent of heap size."

          JEP: http://openjdk.java.net/jeps/189
          Project: http://icedtea.classpath.org/shenandoah/
          Source: http://icedtea.classpath.org/hg/shenandoah

            People

            • Assignee: Unassigned
            • Reporter: Andrew Purtell
            • Votes: 0
            • Watchers: 38