  1. UIMA
  2. UIMA-4743

Errors in plain binary CAS delta serialization of short/long modifications to arrays

Details

    Description

      (Found by code reading; needs a test case.) The code in class CASSerializer that handles modified long/double cells in arrays during delta serialization appears to contain some copy/paste errors. The first (line 583) attempts to get the collection of modified addresses in the long heap, but gets the "short" rather than the "long" addresses. The second (line 587) writes the address of the modified value using writeShort, which silently truncates the value if it is > 32767 and writes only 2 bytes, while the corresponding "read" (in CASImpl, after the comment "//modified Short heap") reads the address as an integer. This makes all reading after that point off by 2 bytes.

      This same kind of error (writing the address as a short) also appears in the handling of modifications of shorts (line 570).
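
      The width mismatch described above can be reproduced outside UIMA. The following is a minimal sketch, not the CASSerializer code: the class AddrWidthMismatch and its methods are hypothetical, and it only assumes that the writer uses java.io.DataOutputStream and that the reader reads the address with readInt, as the description states.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-alone demo; not part of UIMA.
public class AddrWidthMismatch {

    /** Writes one (address, value) pair; the address width is selectable. */
    static byte[] writePair(int addr, long value, boolean addrAsShort) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bytes)) {
            if (addrAsShort) {
                dos.writeShort(addr); // keeps only the low 16 bits, emits 2 bytes
            } else {
                dos.writeInt(addr);   // emits all 4 bytes
            }
            dos.writeLong(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the leading address as a 4-byte int, as the reader is described to do. */
    static int readAddr(byte[] data) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            return dis.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int addr = 40000; // larger than Short.MAX_VALUE (32767)
        byte[] buggy = writePair(addr, 7L, true);
        byte[] fixed = writePair(addr, 7L, false);

        System.out.println("bytes written (short addr): " + buggy.length);
        System.out.println("bytes written (int addr):   " + fixed.length);
        System.out.println("addr read back (short addr): " + readAddr(buggy));
        System.out.println("addr read back (int addr):   " + readAddr(fixed));
    }
}
```

      With the 2-byte address, the reader's readInt consumes two bytes of the following value, so the address comes back wrong and every later field is misaligned. Writing the address with writeInt keeps writer and reader in step.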

      Fixing this will result in writing more bytes to the serialization stream, so the old and new streams won't be "compatible". Therefore, increment the version and serialVersionId values to signal this format change to readers. Client/server pairs will need to be updated at both ends.
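
      One way to signal such a format change is a version number at the start of the stream that the reader checks before decoding. The sketch below only illustrates that idea: the class DeltaFormatVersion, the constant CURRENT_VERSION, and the header layout are hypothetical and are not taken from the UIMA serialization format.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a version-checked header; not the UIMA format.
public class DeltaFormatVersion {

    /** Assumed version number for the layout that writes 4-byte addresses. */
    static final int CURRENT_VERSION = 2;

    /** Writes a version header followed by one 4-byte address. */
    static byte[] write(int addr) {
        return ByteBuffer.allocate(8)
                .putInt(CURRENT_VERSION)
                .putInt(addr)
                .array();
    }

    /** Reads the address, rejecting streams written with a different version. */
    static int readAddr(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int version = buf.getInt();
        if (version != CURRENT_VERSION) {
            throw new IllegalStateException(
                    "unsupported delta format version " + version
                    + ", expected " + CURRENT_VERSION);
        }
        return buf.getInt();
    }

    public static void main(String[] args) {
        System.out.println(readAddr(write(40000)));
    }
}
```

      A reader built this way fails with a clear message when it meets a stream from a mismatched peer, instead of decoding misaligned data.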

      This bug causes deserialization to fail whenever any short or long array values are modified and delta serialization is in use, so it needs to be fixed.

      Added a new test case to check this area and found another issue: the rounding to word boundaries while serializing delta changes to byte and short arrays was incorrect. Fix that as well.
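
      The issue does not spell out the exact rounding rule, so the following is only a generic sketch of rounding a byte count up to a word boundary. It assumes a 4-byte word; the class WordAlign and its methods are hypothetical.

```java
// Hypothetical helper illustrating round-up-to-word arithmetic; not UIMA code.
public class WordAlign {

    /** Assumed word size in bytes. */
    static final int WORD_BYTES = 4;

    /** Rounds a byte count up to the next multiple of WORD_BYTES. */
    static int paddedLength(int byteCount) {
        return (byteCount + WORD_BYTES - 1) / WORD_BYTES * WORD_BYTES;
    }

    /** Padded length, in bytes, of a run of short values (2 bytes each). */
    static int paddedLengthForShorts(int shortCount) {
        return paddedLength(shortCount * Short.BYTES);
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++) {
            System.out.println(n + " bytes -> " + paddedLength(n)
                    + ", " + n + " shorts -> " + paddedLengthForShorts(n));
        }
    }
}
```

      The point to check in a writer and its reader is that both apply the same rule to the same unit: bytes for byte arrays, and two bytes per element for short arrays.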

            People

              schor Marshall Schor