UIMA-5041: JsonCasSerializer creates duplicate shortname




      Our type system includes a type named "com.intersys.uima.annotation.iknow.TOP", which inherits directly from "uima.cas.TOP" and has a number of subtypes specific to our AE. When we serialize this with the JsonCasSerializer, it generates the shortname TOP twice:

      {"_types": [

      {"_id":"com.intersys.uima.annotation.iknow.TOP", "_subtypes":["Entity","ProximityScore"]}


      {"_id":"uima.cas.TOP", "_subtypes":["TOP","AnnotationBase","ArrayBase","Sofa"]}


      While we can work around this by renaming our top type, the documentation explicitly states this shouldn't pose a problem and shortnames would be de-duplicated automatically:
      https://uima.apache.org/d/uimaj-2.8.1/references.html#ugr.ref.json.overview Section 9.2.2:
      In the _types section, the key (e.g. "Sofa" or "A_Typical_User_or_built_in_Type") is the "short" name for the type used in the serialization. It is either just the last segment of the full type name (e.g. for the type x.y.z.TypeName, it's TypeName), or, if name would collide with another type name if just the last segment was used (example: some.package.cname.Foo, and some.other.package.cname.Foo), then the key is made up of the next-to-last segment, with an optional suffixed incrementing integer in case of collisions on that name, a colon (:) and then the last name.
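
      To make the expectation concrete, here is a minimal sketch of how I read the rule quoted above. It is not the JsonCasSerializer implementation; the function name short_names is made up for illustration, and the numbering of the integer suffix (first collision gets no suffix, the next gets 1, and so on) is my assumption, since the documentation does not spell it out.

```python
from collections import Counter


def short_names(full_names):
    """Map each full type name to a key, following my reading of the documented rule.

    Hypothetical helper for illustration only; not UIMA code.
    """
    # Count how often each last segment occurs across all type names.
    last_counts = Counter(name.rsplit(".", 1)[-1] for name in full_names)
    used = set()
    result = {}
    for name in full_names:
        segments = name.split(".")
        key = segments[-1]
        if last_counts[key] > 1:
            # Last segment collides: qualify it with the next-to-last segment.
            prefix = segments[-2] if len(segments) > 1 else ""
            candidate = f"{prefix}:{key}"
            suffix = 0
            # Assumed numbering: append 1, 2, ... if the qualified key is taken.
            while candidate in used:
                suffix += 1
                candidate = f"{prefix}{suffix}:{key}"
            key = candidate
        used.add(key)
        result[name] = key
    return result


if __name__ == "__main__":
    types = [
        "uima.cas.TOP",
        "com.intersys.uima.annotation.iknow.TOP",
        "uima.cas.Sofa",
    ]
    for full, short in short_names(types).items():
        print(f"{short} -> {full}")
```

      Under this reading, the two TOP types would get distinct keys (cas:TOP and iknow:TOP) rather than both appearing as TOP.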

      I see there are unit tests checking for this, but maybe this case slips through because uima.cas.TOP is sort of a special case? Or because neither uima.cas.TOP nor our custom TOP is actually used directly (only their subtypes are)?
      So before I go ahead and change our root type name, I'd like to make sure this isn't something the framework should have taken care of itself.




            schor Marshall Schor
            bdeboe Benjamin De Boe

