Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
3.0.0-alpha-1, 2.4.8
-
None
-
Reviewed
Description
For CellChunkImmutableSegment, cells in data chunks is indexed in index chunks by chunk id ,index chunks use chunk id to locate the data chunk by look up chunk id in ChunkCreator.chunkIdMap . Whether or not the ChunkCreator put the data chunk in ChunkCreator.chunkIdMap depends on the data chunk is pooled or the index type is IndexType.CHUNK_MAP. So when we call ChunkCreator.getChunk, it is very important to specify the IndexType. ChunkCreator has following package-visible method, which does not require the IndexType parameter and use IndexType.ARRAY_MAP.
Chunk getChunk(ChunkType chunkType) {
return getChunk(CompactingMemStore.IndexType.ARRAY_MAP, chunkType);
}
This method is used by following line 363 in MemStoreLABImpl.getNewExternalChunk now, which is used by CellChunkImmutableSegment.copyCellIntoMSLAB which index type is IndexType.CHUNK_MAP to force copy a big cell into MemStoreLAB, which should allocate a new not-pooled chunk.
359 public Chunk getNewExternalChunk(ChunkCreator.ChunkType chunkType) { 360 switch (chunkType) { 361 case INDEX_CHUNK: 362 case DATA_CHUNK: 363 Chunk c = this.chunkCreator.getChunk(chunkType); 364 chunks.add(c.getId()); 365 return c; 366 case JUMBO_CHUNK: // a jumbo chunk doesn't have a fixed size 367 default: 368 return null; 369 } 370 }
Even though fortunately this IndexType mismatch does not cause bug now because MemStoreLABImpl.getNewExternalChunk in fact is not invoked by CellChunkImmutableSegment.copyCellIntoMSLAB when the cell is bigger than chunkSize,but I think it is dangerous to not specify the IndexType if we want to refactor or add new functionalities in the future. We would better to specify the IndexType explicitly when we use ChunkCreator.getChunk.
When thinking this problem more, I think we could remove the IndexType from ChunkCreator completely to simplify the code, because when use MemStoreLABImpl, the IndexType could only be IndexType.CHUNK_MAP ,just as following line 106 in MemStoreLABImpl ctor.
96 public MemStoreLABImpl(Configuration conf) { 97 dataChunkSize = conf.getInt(CHUNK_SIZE_KEY, CHUNK_SIZE_DEFAULT); 98 maxAlloc = conf.getInt(MAX_ALLOC_KEY, MAX_ALLOC_DEFAULT); 99 this.chunkCreator = ChunkCreator.getInstance(); 100 // if we don't exclude allocations >CHUNK_SIZE, we'd infiniteloop on one! 101 Preconditions.checkArgument(maxAlloc <= dataChunkSize, 102 MAX_ALLOC_KEY + " must be less than " + CHUNK_SIZE_KEY); 103 104 // if user requested to work with MSLABs (whether on- or off-heap), then the 105 // immutable segments are going to use CellChunkMap as their index 106 idxType = CompactingMemStore.IndexType.CHUNK_MAP; 107 }
And because the get chunk from ChunkCreator is only used by MemStoreLABImpl, and only when we not use MemStoreLABImpl we could use IndexType.ARRAY_MAP, we can remove the IndexType from ChunkCreator completely.
Attachments
Issue Links
- relates to
-
HBASE-18375 The pool chunks from ChunkCreator are deallocated while in pool because there is no reference to them
- Closed
- links to