Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
The compaction code relies on the SegmentBlob#clone method when a binary is being processed, but it looks like the #clone contract is not fully enforced for streams that qualify as 'long values' (>16k, if I read the code correctly).
What happens is that the stream is initially persisted as chunks in a ListRecord. When compaction calls #clone it gets back the original list of record ids, which are then referenced from the compacted node state [0]. This makes compaction of large binaries ineffective, as the bulk segments never move from the original location where they were created unless the referencing node gets deleted.
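For illustration, below is a minimal, self-contained sketch of the behaviour described above. Only SegmentBlob#clone, the ListRecord chunking, and the ~16k threshold come from this issue; every other class and method name (FakeSegmentBlob, FakeSegmentWriter, cloneTo, writeBlob) is hypothetical and not the actual Oak API.

{code:java}
import java.util.List;

public class CloneSketch {

    // 'long value' threshold (>16k) mentioned above
    static final int SMALL_VALUE_LIMIT = 16 * 1024;

    // Stand-in for a blob stored in the segment store.
    static class FakeSegmentBlob {
        final long length;
        final List<String> recordIds; // chunk record ids held by the ListRecord

        FakeSegmentBlob(long length, List<String> recordIds) {
            this.length = length;
            this.recordIds = recordIds;
        }

        // Sketch of the shortcut this issue describes: long values are not
        // rewritten, the original list of record ids is handed back as-is.
        FakeSegmentBlob cloneTo(FakeSegmentWriter writer) {
            if (length > SMALL_VALUE_LIMIT) {
                // compacted node state ends up pointing at the old bulk segments
                return this;
            }
            // small values are actually copied into the compacted store
            return writer.writeBlob(this);
        }
    }

    // Stand-in for the writer that produces records in the compacted store.
    static class FakeSegmentWriter {
        FakeSegmentBlob writeBlob(FakeSegmentBlob source) {
            return new FakeSegmentBlob(source.length,
                    List.of("new-" + source.recordIds.get(0)));
        }
    }

    public static void main(String[] args) {
        FakeSegmentWriter writer = new FakeSegmentWriter();
        FakeSegmentBlob large = new FakeSegmentBlob(1_000_000, List.of("old-chunk-list"));
        // For a 'long value' the clone is the same object, so the bulk segments
        // behind "old-chunk-list" can never be reclaimed while the reference exists.
        System.out.println(large.cloneTo(writer) == large); // prints: true
    }
}
{code}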
I think the original design was set up to prevent large binaries from being copied over, but given the size problem we have now, it might be a good time to reconsider this approach.