Details
Type: Sub-task
Status: Open
Priority: Major
Resolution: Unresolved
Description
This Set takes 40 bytes per entry (block). As of now the total heap requirement per entry is 160 bytes, so avoiding this Set would be a 25% reduction. The Set is used to remove the blocks of a specific HFile after the file is invalidated (mostly because of compaction or Store close). We should check other ways to remove these blocks, maybe asynchronously, after the compaction is over, by a dedicated cleaner thread. It might be OK not to remove the invalidated file's entries immediately: when the cache runs out of space, the eviction thread might select and remove them.
A few things to consider/change:
1. When the compaction process reads blocks, they might be served from the cache. We should not count this as a real access to the block. Not counting it increases the chances of the eviction thread selecting this block for removal. We should be able to clearly distinguish cache reads made by the compaction process from reads made by user processes.
2. When the compaction process reads a block from the cache, can we mark the block in some way (using a one-byte boolean) to record that it has just gone through compaction? Later, when the eviction thread selects a block and there is a tie because of the same access time/count, we can break the tie in favor of selecting the already compacted block. Need to check the pros and cons.
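The two points above can be sketched roughly as follows. This is only an illustration under stated assumptions, not existing HBase code: the class `CompactionAwareCache`, its `Entry` type, and the methods `put`, `get`, and `evictOne` are all hypothetical names, block keys are plain strings, and access time is a caller-supplied long. A compaction read leaves the access time untouched (point 1) and sets the marker (point 2). The eviction order picks the oldest access time first and breaks ties in favor of the marked block.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names come from HBase. */
class CompactionAwareCache {

  /** Per-block bookkeeping assumed by this sketch. */
  static final class Entry {
    final String key;
    long lastAccess;           // time of the last user read (or of insertion)
    boolean readByCompaction;  // the proposed one-byte marker

    Entry(String key, long lastAccess) {
      this.key = key;
      this.lastAccess = lastAccess;
    }
  }

  /** Oldest access first; on a tie, blocks already read by compaction first. */
  static final Comparator<Entry> EVICTION_ORDER =
      Comparator.comparingLong((Entry e) -> e.lastAccess)
          .thenComparing((a, b) -> Boolean.compare(b.readByCompaction, a.readByCompaction));

  private final Map<String, Entry> entries = new HashMap<>();

  /** Caches a block; insertion counts as an access at time 'now'. */
  void put(String key, long now) {
    entries.put(key, new Entry(key, now));
  }

  /** Returns true on a cache hit. Only user reads refresh the access time. */
  boolean get(String key, long now, boolean compactionRead) {
    Entry e = entries.get(key);
    if (e == null) {
      return false;
    }
    if (compactionRead) {
      e.readByCompaction = true;  // mark the block, leave lastAccess untouched
    } else {
      e.lastAccess = now;         // real access by a user read
    }
    return true;
  }

  /** Removes and returns the key of the preferred victim, or null if empty. */
  String evictOne() {
    if (entries.isEmpty()) {
      return null;
    }
    Entry victim = Collections.min(entries.values(), EVICTION_ORDER);
    entries.remove(victim.key);
    return victim.key;
  }
}
```

With blocks `a` and `b` both inserted at time 1 and only `b` read by a compaction, `evictOne()` selects `b` first. Whether a later user read should clear the marker is one of the pros and cons still to be checked; the sketch leaves it set.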