Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
Currently, for Blob GC on a segment store, SegmentBlobReferenceRetriever goes through all tar files and extracts the binary references. This has two issues:
- The logic has to go through all the segments in all tar files
- All segments get loaded into memory once, which can affect normal system performance
This process can be optimized if we also write a file entry in each tar file (similar to the gph, i.e. graph, and idx, i.e. index, files) that lists all binary references referred to by any segment present in that tar file. The GC logic would then only have to read this entry and could avoid scanning all the segments.
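
To make the proposal concrete, here is a minimal sketch of how such a per-tar entry could be encoded and decoded. It is an illustration only: the class name `BinaryReferencesEntry`, its methods, and the byte layout (a count followed by length-prefixed UTF-8 strings) are assumptions made for this example, not Oak's actual API or on-disk format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: neither the class name nor the layout comes from Oak.
public final class BinaryReferencesEntry {

    private BinaryReferencesEntry() {}

    // Encode the blob references collected while segments were written to one tar file.
    // Assumed layout: [int count], then per reference [int length][UTF-8 bytes].
    public static byte[] serialize(Set<String> references) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(references.size());
            for (String reference : references) {
                byte[] bytes = reference.getBytes(StandardCharsets.UTF_8);
                out.writeInt(bytes.length);
                out.write(bytes);
            }
        }
        return buffer.toByteArray();
    }

    // Decode the entry again; this is all GC would need to read from a tar file,
    // instead of loading and scanning every segment in it.
    public static Set<String> deserialize(byte[] entry) throws IOException {
        Set<String> references = new TreeSet<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                byte[] bytes = new byte[in.readInt()];
                in.readFully(bytes);
                references.add(new String(bytes, StandardCharsets.UTF_8));
            }
        }
        return references;
    }
}
```

Under these assumptions, the writer would call `serialize` once when a tar file is closed, and Blob GC would take the union of the sets returned by `deserialize` across all tar files.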