Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
format/Message.fbs mentions DictionaryBatches with an id but does not specify where that id is referenced.
We should add a dictionary: long member to Field that references the dictionary id:
Field: https://github.com/apache/arrow/blob/34e7f48cb71428c4d78cf00d8fdf0045532d6607/format/Message.fbs#L86
Dictionary id: https://github.com/apache/arrow/blob/34e7f48cb71428c4d78cf00d8fdf0045532d6607/format/Message.fbs#L165
We need a spec in format/Layout.md that describes the dictionary layout.
When a field is dictionary-encoded, its value vector is an array of signed int32 indices (for consistency with variable-length collection offsets).
The dictionary vector is a vector of the value's type, in which each value is indexed by its id in the dictionary.
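
The layout described above can be illustrated with a minimal Python sketch. It is not Arrow code: the function names, the plain dict standing in for a Field, and the dictionary id 0 are all hypothetical, and array("i") is assumed to be a signed 32-bit integer, which holds on common platforms but is not guaranteed.

```python
from array import array


def dictionary_encode(values):
    """Split values into a dictionary vector and a vector of indices into it."""
    dictionary = []       # distinct values, in first-seen order
    positions = {}        # value -> its position (id) in `dictionary`
    indices = array("i")  # signed int; assumed 32-bit on this platform
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original values by looking each index up in the dictionary."""
    return [dictionary[i] for i in indices]


# Hypothetical stand-ins: the field carries only the dictionary id, and the
# dictionary itself is stored separately, keyed by that id.
dictionary, indices = dictionary_encode(["a", "b", "a", "c", "b"])
dictionaries = {0: dictionary}
field = {"name": "letters", "dictionary": 0}

decoded = dictionary_decode(dictionaries[field["dictionary"]], indices)
```

Here the field stores only the id, so several fields could share one dictionary by referring to the same id.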