Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
If someone uses `ArrowBuffer.Builder<bool>` in their code to create an ArrowBuffer filled with Boolean values, the builder currently produces the wrong results.
The results are wrong because the builder takes `sizeof(bool)` (which is 1) and uses that as the number of bytes to write into the backing buffer for each element added. In Arrow, however, Boolean values are bit-packed, so a single byte holds 8 Boolean values. Thus, when I add 4 `true` values to the buffer, I expect a buffer containing 1 byte with the value 0x0F. Instead, I get a buffer containing 4 bytes, each with the value 0x01.
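The expected layout can be illustrated with a minimal sketch. `BitPackedBoolBuilder` below is a hypothetical standalone class, not part of Apache.Arrow; it assumes Arrow's least-significant-bit-first ordering within each byte and only shows how appended Booleans map to bytes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not the Apache.Arrow API.
public sealed class BitPackedBoolBuilder
{
    private readonly List<byte> _bytes = new List<byte>();

    // Number of Boolean values appended so far.
    public int Length { get; private set; }

    public BitPackedBoolBuilder Append(bool value)
    {
        int bitIndex = Length % 8;

        // Start a new zeroed byte every 8 values.
        if (bitIndex == 0)
        {
            _bytes.Add(0);
        }

        if (value)
        {
            // Set bit `bitIndex` of the last byte (least-significant bit first).
            int last = _bytes.Count - 1;
            _bytes[last] = (byte)(_bytes[last] | (1 << bitIndex));
        }

        Length++;
        return this;
    }

    // Copy of the packed bytes; unused high bits in the last byte stay 0.
    public byte[] ToArray() => _bytes.ToArray();
}
```

Appending four `true` values with this sketch yields a single byte, 0x0F, rather than four bytes of 0x01.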
One way to fix this would be to throw in `ArrowBuffer.Builder<T>`'s constructor if `T` is `bool`, and to add a new class, `ArrowBuffer.BooleanBuilder`, that creates Boolean buffers correctly. Looking at the current implementation, I think it would be rather hard to special-case `typeof(bool)` throughout the `Builder` class, but if someone wanted to take that approach and made it work, that would be great too.
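The constructor guard could look roughly like the following. This is a sketch under assumptions: `GenericBufferBuilder<T>` is a hypothetical stand-in for `ArrowBuffer.Builder<T>`, and the choice of `NotSupportedException` is illustrative rather than taken from the Arrow code base.

```csharp
using System;

// Hypothetical stand-in for a generic buffer builder; not the Apache.Arrow implementation.
public class GenericBufferBuilder<T> where T : struct
{
    public GenericBufferBuilder()
    {
        // Reject bool up front: writing one byte per element does not
        // match Arrow's bit-packed Boolean layout.
        if (typeof(T) == typeof(bool))
        {
            throw new NotSupportedException(
                "Use a dedicated bit-packed Boolean builder instead of a generic builder of bool.");
        }
    }
}
```

With this guard, constructing `GenericBufferBuilder<bool>` fails immediately instead of silently producing a mis-sized buffer, while other value types such as `int` are unaffected.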
Issue Links
- is superseded by ARROW-8788 [C#] Array builders to use bit-packed buffer builder rather than boolean array builder for validity map (Resolved)