Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Later
Affects Version/s: None
Fix Version/s: None
Description
RLE (run-length encoding) is a widely used encoding technique. Compared with other encodings, RLE-encoded data is easier to work with directly.
We want to provide an RLE vector implementation in Arrow. The design details include:
1. RleVector implements ValueVector.
2. The data structure of RleVector consists of an inner vector holding the run values, plus a buffer storing the end index of each run.
3. Random access is supported with O(log n) time complexity, where n is the number of runs, so it should not be used frequently.
4. In the future, we will provide iterators for accessing the vector sequentially.
5. RleVector does not support updates, but it does support appending.
6. In the future, we will provide an encoder/decoder to convert efficiently between encoded and decoded vectors.
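
To make the design points above concrete, here is a minimal sketch in plain Java. It is not the proposed RleVector and does not use any Arrow classes: the class name `RleSketch`, its methods, and the use of `ArrayList` in place of an inner vector and an end-index buffer are all assumptions for illustration. It assumes end indices are exclusive, and that random access is a binary search over them, which is one way to get the logarithmic lookup mentioned in item 3.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical sketch (not Arrow's API): run values plus exclusive run end indices. */
final class RleSketch<T> {
  // Stand-in for the inner vector: one value per run.
  private final List<T> values = new ArrayList<>();
  // Stand-in for the end-index buffer: exclusive logical end index of each run.
  private final List<Integer> runEnds = new ArrayList<>();

  /** Logical (decoded) length, i.e. the end index of the last run. */
  int size() {
    return runEnds.isEmpty() ? 0 : runEnds.get(runEnds.size() - 1);
  }

  /** Number of runs actually stored. */
  int runCount() {
    return values.size();
  }

  /** Appends one logical value: extends the last run if equal, else starts a new run. */
  void append(T value) {
    int last = values.size() - 1;
    if (last >= 0 && Objects.equals(values.get(last), value)) {
      runEnds.set(last, runEnds.get(last) + 1);
    } else {
      values.add(value);
      runEnds.add(size() + 1);
    }
  }

  /** Random access: binary search for the first run whose end index exceeds index. */
  T get(int index) {
    if (index < 0 || index >= size()) {
      throw new IndexOutOfBoundsException("index: " + index);
    }
    int lo = 0;
    int hi = runEnds.size() - 1;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (runEnds.get(mid) > index) {
        hi = mid;
      } else {
        lo = mid + 1;
      }
    }
    return values.get(lo);
  }

  /** Decodes to a plain list by expanding every run in order. */
  List<T> decode() {
    List<T> out = new ArrayList<>(size());
    int start = 0;
    for (int r = 0; r < values.size(); r++) {
      int end = runEnds.get(r);
      for (int i = start; i < end; i++) {
        out.add(values.get(r));
      }
      start = end;
    }
    return out;
  }
}
```

In this sketch, appending a, a, a, b, c, c stores three runs with end indices 3, 4, 6, and `get(3)` returns b. There is deliberately no method for overwriting an existing position, mirroring item 5: changing one value in the middle of a run would split the run and shift every later entry.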