Beam / BEAM-13081

Portable representation of "packed bitset indicating null fields" in Beam Row format is not compatible with JVM representations

Details

    • Type: Bug
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: cross-language
    • Labels: None

    Description

      The JVM RowCoder strips trailing zero bytes from the null-value bitmap, while both the Python and Go coders expect every bit to be present in the encoded bitmap. In some circumstances, namely when stripping leaves the bitmap shorter than the decoder expects, this causes index-out-of-range errors when a row encoded on the JVM is decoded in those languages.

      For example, given a Row with 10 nullable fields where the first 8 are null and the last two are set, decoding fails in Python: the JVM-encoded null bitmap has only 1 byte, but the Python coder expects 2.
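The 10-field case above can be reproduced with a minimal sketch. It assumes one bit per field packed least-significant-bit first, with trailing zero bytes dropped on the JVM side; `pack_null_bitmap` and `strip_trailing_zero_bytes` are hypothetical helper names for illustration, not functions from the Beam codebase.

```python
def pack_null_bitmap(nulls):
    """Pack one bit per field (LSB first) into ceil(n / 8) bytes."""
    words = bytearray((len(nulls) + 7) // 8)
    for i, is_null in enumerate(nulls):
        if is_null:
            words[i // 8] |= 1 << (i % 8)
    return bytes(words)


def strip_trailing_zero_bytes(bitmap):
    """Mimic the truncation described in this issue: drop trailing zero bytes."""
    return bitmap.rstrip(b"\x00")


# First 8 fields are null, last 2 are set.
nulls = [True] * 8 + [False] * 2

full = pack_null_bitmap(nulls)                # 2 bytes: what Python and Go expect
truncated = strip_trailing_zero_bytes(full)   # 1 byte: what the JVM is described as emitting

print(len(full), len(truncated))
```

A decoder that indexes byte `i // 8` for every field would read past the end of `truncated` at field 8, which is the index-out-of-range failure described above.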

      As discussed in the thread linked below, the best solution is probably to change the Python (and Go) coders to accept truncated null bitmaps.
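One way a decoder could accept truncated bitmaps is to treat any bit beyond the end of the encoded bytes as 0 (field not null). The following is a sketch only, again assuming least-significant-bit-first packing; `decode_null_bitmap` is a hypothetical name and this is not the actual Beam coder implementation.

```python
def decode_null_bitmap(bitmap, num_fields):
    """Return a list of is-null flags, tolerating a truncated bitmap.

    Bits that fall past the end of `bitmap` are treated as 0 (not null).
    """
    nulls = []
    for i in range(num_fields):
        byte_index = i // 8
        if byte_index < len(bitmap):
            nulls.append(bool((bitmap[byte_index] >> (i % 8)) & 1))
        else:
            # Byte was stripped by the encoder, so every bit in it was 0.
            nulls.append(False)
    return nulls


# Truncated (1 byte) and full (2 bytes) encodings decode to the same flags.
flags_truncated = decode_null_bitmap(b"\xff", 10)
flags_full = decode_null_bitmap(b"\xff\x00", 10)
```

With this approach both encodings are accepted, so existing encoders would not need to change.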

       

      More discussion here:

      https://lists.apache.org/thread.html/r2f148e29902bda8bb0ff7106fffb8a5494295450827ad7fd17289383%40%3Cdev.beam.apache.org%3E

          People

            Assignee: Unassigned
            Reporter: Steve Niemitz
            Votes: 0
            Watchers: 5

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 7h 50m