Description
This is a follow-up to OAK-7052, where we noticed that the deleted-blob list files collected by the active deletion logic can grow very large due to inlined blobs.
One potential mitigation (not sure how yet) is to not actively delete inlined blobs.
Here are some stats that might help us decide, based on the raw numbers collected at [0]. "large" counts entries whose blob id is longer than 500 characters, "small" counts the rest, and the size columns are the summed blob id lengths:
file-name | large_lines | large_size | small_lines | small_size | small_lines/total_lines | small_size/total_size |
---|---|---|---|---|---|---|
blobs-1512664032264.txt | 245301 | 3310224358 | 173096 | 35473656 | 0.413712335413495 | 0.010602766852107 |
blobs-1512698405656.txt | 370373 | 4443957885 | 256775 | 52997864 | 0.409432861142824 | 0.011785275852845 |
blobs-1512987450004.txt | 660669 | 6214740439 | 461168 | 92017554 | 0.411082893504137 | 0.014590309966251 |
blobs-1513130410963.txt | 569083 | 5490965583 | 406756 | 80124598 | 0.416826956085994 | 0.014382211631264 |
blobs-1513216819447.txt | 69876 | 1413561892 | 46238 | 9221956 | 0.398212101899857 | 0.006481628262061 |
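The two ratio columns can be reproduced from the four raw counts. The following is a minimal sketch, not part of the original analysis: `blob_ratios` is a hypothetical helper name, and it assumes whitespace-separated input rows of the form `file large_lines large_size small_lines small_size`.

```shell
# Hypothetical helper: derive the two ratio columns from the raw counts.
# Input (stdin): rows of "file large_lines large_size small_lines small_size".
# Output: "file small_lines/total_lines small_size/total_size", 4 decimals.
blob_ratios() {
  awk '{
    total_lines = $2 + $4
    total_size  = $3 + $5
    # Skip rows without entries to avoid dividing by zero.
    if (total_lines == 0 || total_size == 0) next
    printf "%s %.4f %.4f\n", $1, $4 / total_lines, $5 / total_size
  }'
}
```

For example, `echo "blobs-1512664032264.txt 245301 3310224358 173096 35473656" | blob_ratios` prints `blobs-1512664032264.txt 0.4137 0.0106`, matching the first table row: roughly 41% of the entries are small, but they account for only about 1% of the bytes.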
[0]:
file sizes
repository/index/deleted-blobs$ ls -l blobs-151*
-rw-r--r-- 1 root root 3369065620 Dec 8 01:59 blobs-1512664032264.txt
-rw-r--r-- 1 root root 4532250073 Dec 9 01:59 blobs-1512698405656.txt
-rw-r--r-- 1 root root 6370201955 Dec 13 01:59 blobs-1512987450004.txt
-rw-r--r-- 1 root root 1916223582 Dec 13 11:52 blobs-1513130410963.txt
number of entries
repository/index/deleted-blobs$ wc -l blobs-151*
418397 blobs-1512664032264.txt
627148 blobs-1512698405656.txt
1121837 blobs-1512987450004.txt
308292 blobs-1513130410963.txt
2475674 total
number of entries and summed blob id sizes, split at a blob id length threshold of 500
repository/index/deleted-blobs$ for i in blobs-151*;do echo $i;awk 'BEGIN {FS="|"} {len = length($1); if (len > 500) {large++; largeSize+=len} else {small++; smallSize+=len}} END {print large, largeSize, small, smallSize}' $i;done
blobs-1512664032264.txt
245301 3310224358 173096 35473656
blobs-1512698405656.txt
370373 4443957885 256775 52997864
blobs-1512987450004.txt
660669 6214740439 461168 92017554
blobs-1513130410963.txt
569083 5490965583 406756 80124598
blobs-1513216819447.txt
69876 1413561892 46238 9221956
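To estimate how much a list file would shrink if long blob ids were left out, the same threshold can be used as a filter. This is only a sketch under stated assumptions: `filter_short_blob_ids` is a hypothetical helper, it treats the first `|`-separated field as the blob id (as the awk command above does), and it uses id length as a rough proxy for "inlined". It does not show how Oak itself would identify inlined blobs.

```shell
# Hypothetical helper: pass through only entries whose blob id (first
# "|"-separated field) is at most max_len characters long.
# Assumption: id length is just a stand-in for detecting inlined blobs.
filter_short_blob_ids() {
  max_len="${1:-500}"
  awk -v max="$max_len" 'BEGIN { FS = "|" } length($1) <= max'
}
```

A possible use is `filter_short_blob_ids 500 < blobs-1512664032264.txt | wc -c`, which gives the size of the file with only the short-id entries kept, for comparison against the `ls -l` figure.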
Issue Links
- is related to OAK-7052 Active deletion purge can OOM if number of blobs listed in a file become too large (Closed)