The current buffer allocation policy works like this:
- If the requested buffer size is greater than or equal to the chunk size, the requested size is used as is.
- If the requested size is smaller than the chunk size, the buffer size is rounded up to the next power of 2.
This policy can waste memory in some cases. For example, if we request a 10 MB buffer, Arrow rounds the buffer size up to 16 MB. If we only need 10 MB, the extra 6 MB is wasted, which is an overhead of (16 - 10) / 10 = 60% of the requested size.
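The arithmetic above can be checked with a minimal Python sketch of the current policy. This is an illustration only, not Arrow's actual allocator code: the function names are hypothetical, and the 16 MB chunk size is an assumed value chosen so that the 10 MB request falls below it.

```python
MIB = 1024 * 1024


def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def power_of_two_policy(requested: int, chunk_size: int) -> int:
    """Sketch of the current policy (hypothetical helper, not Arrow's API)."""
    if requested >= chunk_size:
        # Requests at or above the chunk size are used as is.
        return requested
    # Smaller requests are rounded up to the next power of two.
    return next_power_of_two(requested)


requested = 10 * MIB
# chunk_size=16 MiB is an assumption made for this example.
allocated = power_of_two_policy(requested, chunk_size=16 * MIB)
overhead = (allocated - requested) / requested

print(allocated // MIB, "MiB allocated, overhead", f"{overhead:.0%}")
```

The overhead is largest just above a power of two, where the rounded size approaches twice the requested size.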
This proposal therefore adds another policy: the buffer size is rounded up to a multiple of a fixed memory unit (for example, 32 KB). This policy has two benefits:
- The wasted memory per buffer is always less than one memory unit (32 KB), which is far less than the power-of-two policy can waste.
- Some computation engines (e.g. Apache Flink) already use this memory allocation policy.
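A minimal Python sketch of the proposed rounding, for comparison. The function name and the default 32 KB unit are assumptions for illustration. The sketch covers only the rounding step and does not say how the policy would be wired into the allocator.

```python
KIB = 1024
MIB = 1024 * KIB


def round_up_to_unit(requested: int, unit: int = 32 * KIB) -> int:
    """Round requested up to the nearest multiple of unit (hypothetical helper)."""
    if requested < 0 or unit <= 0:
        raise ValueError("requested must be >= 0 and unit must be > 0")
    # Ceiling division using integers only, then scale back to bytes.
    return -(-requested // unit) * unit


# 10 MiB is already a multiple of 32 KiB, so nothing is wasted.
exact = round_up_to_unit(10 * MIB)
waste_exact = exact - 10 * MIB

# Worst case: one byte over a multiple wastes unit - 1 bytes.
worst = round_up_to_unit(10 * MIB + 1)
waste_worst = worst - (10 * MIB + 1)

print(waste_exact, waste_worst)
```

In this sketch the waste stays below one unit whatever the request size, whereas the power-of-two policy can waste close to the full requested size.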