Apache Arrow / ARROW-5767

[Format] Permit dictionary replacements in IPC protocol

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: Format
    • Labels:
      None

      Description

      We permit dictionaries to grow using the isDelta property in the IPC protocol. I think the same dictionary ID should also be allowed to appear again in an IPC protocol stream with isDelta=false. This would indicate that the dictionary in that message replaces any previously observed dictionary with that ID for all subsequent record batches.

      For example, we might have dictionary batches in a stream:

      id: 0 isDelta: false values: [a, b, c]
      id: 0 isDelta: true values: [d]
      id: 0 isDelta: false values: [c, a, b]
      

      Such data could easily be produced by a stream producer that is creating dictionaries in different execution threads.
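
To make the proposed semantics concrete, here is a minimal sketch of how a stream consumer might track dictionary state under this proposal. It is not Arrow's implementation: `DictionaryBatch` and `apply_dictionary_batch` are hypothetical names invented for illustration, and rejecting a delta for an unseen ID is an assumption rather than something stated in this issue.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DictionaryBatch:
    """Hypothetical stand-in for an IPC dictionary batch message."""
    id: int
    is_delta: bool
    values: List[str]


def apply_dictionary_batch(state: Dict[int, List[str]], batch: DictionaryBatch) -> None:
    """Update the consumer's per-ID dictionary state with one batch."""
    if batch.is_delta:
        # Delta: append the new values to the dictionary already seen for this ID.
        # Assumption: a delta for an ID with no prior dictionary is an error.
        if batch.id not in state:
            raise ValueError(f"delta for unknown dictionary id {batch.id}")
        state[batch.id] = state[batch.id] + batch.values
    else:
        # Non-delta: first sighting or replacement. Either way, the new values
        # become the whole dictionary for subsequent record batches.
        state[batch.id] = list(batch.values)


# The example stream from the description above.
stream = [
    DictionaryBatch(id=0, is_delta=False, values=["a", "b", "c"]),
    DictionaryBatch(id=0, is_delta=True, values=["d"]),
    DictionaryBatch(id=0, is_delta=False, values=["c", "a", "b"]),
]

state: Dict[int, List[str]] = {}
snapshots = []
for batch in stream:
    apply_dictionary_batch(state, batch)
    # Record the dictionary that record batches following this message would use.
    snapshots.append(list(state[0]))
```

After the three messages, `snapshots` holds `[a, b, c]`, then `[a, b, c, d]`, then `[c, a, b]`: the third message discards the grown dictionary rather than being rejected as a duplicate ID.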

            People

            • Assignee:
              Unassigned
            • Reporter:
              wesmckinn Wes McKinney