Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0
    • Component/s: Format
    • Labels:
      None

      Description

      format/Message.fbs mentions DictionaryBatches with an id but does not specify where that id is referenced.

      We should add a dictionary: long member to Field that references the dictionary id:

      Field: https://github.com/apache/arrow/blob/34e7f48cb71428c4d78cf00d8fdf0045532d6607/format/Message.fbs#L86

      Dictionary id: https://github.com/apache/arrow/blob/34e7f48cb71428c4d78cf00d8fdf0045532d6607/format/Message.fbs#L165
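
      As a rough illustration of the proposed link, the sketch below models a Field carrying a dictionary id and looks up the matching DictionaryBatch. It is not the Flatbuffers schema or any Arrow API: the Python classes, the resolve_dictionary helper, and the use of None for "not dictionary encoded" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DictionaryBatch:
    """Hypothetical stand-in for a DictionaryBatch message."""
    id: int
    values: List[str]


@dataclass
class Field:
    """Hypothetical stand-in for a Field entry in the schema."""
    name: str
    type_name: str
    # Proposed member: id of the dictionary this field refers to.
    # None (an assumption of this sketch) means "not dictionary encoded".
    dictionary: Optional[int] = None


def resolve_dictionary(
    field: Field, batches: Dict[int, DictionaryBatch]
) -> Optional[DictionaryBatch]:
    """Return the batch referenced by field.dictionary, or None if unset."""
    if field.dictionary is None:
        return None
    return batches[field.dictionary]


# Usage: one dictionary batch, referenced by one of two fields.
batches = {7: DictionaryBatch(id=7, values=["red", "green", "blue"])}
color = Field(name="color", type_name="utf8", dictionary=7)
count = Field(name="count", type_name="int32")
```

      With this shape, a reader that has collected the dictionary batches by id can resolve each field without any extra lookup table.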

      We need a spec in format/Layout.md that describes the dictionary layout.
      When a field is dictionary encoded, its value vector is an array of signed int32 indices (for consistency with variable-length collection offsets).
      The dictionary vector is a vector of the value's type, whose entries are indexed by their id in the dictionary.
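
      The layout described above can be sketched as follows. This is a minimal illustration rather than the layout that Layout.md would specify: the function names are hypothetical, nulls are not handled, and a plain Python list stands in for the dictionary vector.

```python
from array import array


def dictionary_encode(values):
    """Split values into (dictionary, indices).

    dictionary holds each distinct value once, in first-seen order.
    indices holds, for every input value, its position in dictionary.
    """
    dictionary = []
    positions = {}  # value -> index in dictionary
    # Typecode "i" is a signed C int (32 bits on common platforms),
    # standing in for the signed int32 index vector.
    indices = array("i")
    for value in values:
        index = positions.get(value)
        if index is None:
            index = len(dictionary)
            positions[value] = index
            dictionary.append(value)
        indices.append(index)
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original values by looking up each index."""
    return [dictionary[i] for i in indices]


# Usage: five values, three distinct entries in the dictionary.
original = ["a", "b", "a", "c", "b"]
dictionary, indices = dictionary_encode(original)
```

      Decoding with dictionary_decode(dictionary, indices) returns the original list, which is the round-trip property any such layout needs to preserve.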

              People

              • Assignee:
                julienledem Julien Le Dem
                Reporter:
                julienledem Julien Le Dem
              • Votes: 0
                Watchers: 3
