Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Later
Description
The SegmentWriter keeps a record deduplication cache (the 'records' map) that maintains two types of mappings:
- template -> recordid
- string -> recordid
For the first one (template -> recordid), we could use a thinner representation of a template, such as a hash that is fast to compute and has a low collision rate, so that the cache does not have to hold a reference to each template object.
The same applies to the second one: similar to what the StringsCache does now, we could keep the string value itself up to a certain size and, beyond that, hash it and use the hash as the key in the deduplication map.
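A minimal sketch of the idea, not taken from Oak: the class `CompactRecordCache`, its methods, and the 128-character cut-off are all hypothetical, and SHA-256 stands in for whatever fast, low-collision hash would actually be chosen. It assumes the caller can supply a serialized form of a template as bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names exist in Oak's SegmentWriter.
final class CompactRecordCache<R> {
    // Assumed cut-off: strings up to this length are kept verbatim as keys.
    static final int MAX_INLINE_LENGTH = 128;

    private final Map<String, R> strings = new HashMap<>();
    private final Map<String, R> templates = new HashMap<>();

    // Short strings key themselves; longer ones are replaced by a digest.
    // The "s:" / "h:" prefixes keep the two kinds of key from clashing.
    static String stringKey(String value) {
        if (value.length() <= MAX_INLINE_LENGTH) {
            return "s:" + value;
        }
        return "h:" + sha256Hex(value.getBytes(StandardCharsets.UTF_8));
    }

    // A template is represented only by the digest of its serialized form,
    // so the cache never holds a reference to the template object itself.
    static String templateKey(byte[] serializedTemplate) {
        return sha256Hex(serializedTemplate);
    }

    R getString(String value) {
        return strings.get(stringKey(value));
    }

    void putString(String value, R recordId) {
        strings.put(stringKey(value), recordId);
    }

    R getTemplate(byte[] serializedTemplate) {
        return templates.get(templateKey(serializedTemplate));
    }

    void putTemplate(byte[] serializedTemplate, R recordId) {
        templates.put(templateKey(serializedTemplate), recordId);
    }

    // SHA-256 is a stand-in here, not necessarily the hash Oak would use.
    private static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder(64);
            for (byte b : digest.digest(data)) {
                hex.append(Character.forDigit((b >> 4) & 0xF, 16));
                hex.append(Character.forDigit(b & 0xF, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
    }
}
```

With hash-only keys, a collision would make the cache return the record id of a different value, so the choice of hash is a trade-off: a cryptographic digest makes collisions negligible but costs more to compute, while a faster non-cryptographic hash lowers the cost at a higher collision risk.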
Attachments
Issue Links
- is blocked by: OAK-3958 Split SegmentWriter records cache into 2: strings and templates (Closed)