Apache Arrow / ARROW-5767

[Format] Permit dictionary replacements in IPC protocol


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.16.0
    • Component/s: Format
    • Labels: None

    Description

      We permit dictionaries to grow using the isDelta property in the IPC protocol. I think the same dictionary ID should also be allowed to appear more than once in an IPC stream with isDelta=false. Such a message would indicate that its dictionary replaces any previously observed dictionary with that ID for subsequent record batches.

      For example, we might have dictionary batches in a stream:

      id: 0 isDelta: false values: [a, b, c]
      id: 0 isDelta: true values: [d]
      id: 0 isDelta: false values: [c, a, b]
      

      Such data could easily be produced by a stream producer that is creating dictionaries in different execution threads.
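
      The proposed semantics can be sketched from the reader's side. This is a minimal illustration in plain Python, not Arrow's implementation: the DictionaryBatch class and apply_dictionary_batch function are hypothetical names introduced here, and rejecting a delta for an ID that has not been seen yet is an assumption rather than something stated in this issue.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DictionaryBatch:
    """Hypothetical stand-in for an IPC dictionary batch message."""
    id: int
    is_delta: bool
    values: List[str]


def apply_dictionary_batch(state: Dict[int, List[str]], batch: DictionaryBatch) -> None:
    """Update the reader's per-ID dictionary state for one incoming batch."""
    if batch.is_delta:
        # Delta: append the new values to the dictionary already held for this ID.
        if batch.id not in state:
            # Assumption: a delta with no prior dictionary is treated as an error.
            raise ValueError(f"delta for unknown dictionary id {batch.id}")
        state[batch.id] = state[batch.id] + batch.values
    else:
        # Non-delta: install the dictionary, replacing any earlier one with this ID.
        state[batch.id] = list(batch.values)


# The three dictionary batches from the example above.
stream = [
    DictionaryBatch(id=0, is_delta=False, values=["a", "b", "c"]),
    DictionaryBatch(id=0, is_delta=True, values=["d"]),
    DictionaryBatch(id=0, is_delta=False, values=["c", "a", "b"]),
]

state: Dict[int, List[str]] = {}
history: List[List[str]] = []
for batch in stream:
    apply_dictionary_batch(state, batch)
    # Snapshot of dictionary 0 as seen by record batches that follow this message.
    history.append(list(state[0]))

for snapshot in history:
    print(snapshot)
```

      In this sketch, record batches that follow the third message would resolve index 0 to c rather than a, which is why a replacement only applies to subsequent record batches and not to ones already read.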


            People

              Assignee: Unassigned
              Reporter: Wes McKinney (wesm)
              Votes: 0
              Watchers: 5
