Apache Avro / AVRO-3512

aliases to the null namespace do not work as expected

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.11.0
    • Fix Version/s: None
    • Component/s: java, spec

    Description

      the avro spec allows for the "null namespace" (when no namespace is specified anywhere). it also has the following to say about aliases:

      if a type named "a.b" has aliases of "c" and "x.y", then the fully qualified names of its aliases are "a.c" and "x.y"

      which means a "simple" alias ("c" above) inherits any namespace defined on the declaring type.
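
a minimal sketch of that resolution rule in plain java, to make the discussion below concrete. `AliasNames` and `resolve` are hypothetical names, not part of the avro API, and treating an alias with a leading dot as "null namespace" is my reading of the spec rather than something it states outright:

```java
// Illustrative helper only; not part of the Avro API.
final class AliasNames {

  /**
   * Resolves an alias to a fullname.
   *
   * @param alias     the alias exactly as written in the schema
   * @param namespace the namespace of the declaring type, or null for the null namespace
   */
  static String resolve(String alias, String namespace) {
    int lastDot = alias.lastIndexOf('.');
    if (lastDot < 0) {
      // Simple alias: inherit the declaring type's namespace, if it has one.
      return (namespace == null || namespace.isEmpty()) ? alias : namespace + "." + alias;
    }
    String aliasNamespace = alias.substring(0, lastDot);
    String simpleName = alias.substring(lastDot + 1);
    // Assumption: an empty namespace (leading dot) means the null namespace.
    return aliasNamespace.isEmpty() ? simpleName : aliasNamespace + "." + simpleName;
  }

  public static void main(String[] args) {
    // The spec's example: a type named "a.b" lives in namespace "a".
    System.out.println(resolve("c", "a"));                        // a.c
    System.out.println(resolve("x.y", "a"));                      // x.y
    // The case this issue is about.
    System.out.println(resolve(".AncientEnum", "much.namespace")); // AncientEnum
    System.out.println(resolve("AncientEnum", "much.namespace"));  // much.namespace.AncientEnum
  }
}
```

the last two calls show why the leading dot matters: without it the alias lands in the declaring type's namespace instead of the null namespace.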

       

      now suppose i want to use aliases on a namespaced reader schema to read data written with a schema that is in the null namespace (has no namespace).

      here is my writer schema:

      {
        "type": "record",
        "name": "AncientSchema",
        "fields": [
          {
            "name" : "enumField",
            "type" : {
              "type" : "enum",
              "name" : "AncientEnum",
              "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ]
            }
          }
        ]
      }
      

      and reader schema:

      {
        "type": "record",
        "namespace": "much.namespace",
        "name": "ModernRecord",
        "fields": [
          {
            "name" : "enumField",
            "type" : {
              "type" : "enum",
              "name" : "ModernEnum",
              "symbols" : [ "THE", "SPEC", "IS", "A", "LIE" ],
              "aliases": [
                 ".AncientEnum"
              ]
            }
          }
        ],
        "aliases": [
          ".AncientSchema"
        ]
      }
      

      notice the leading dots in the aliases. as far as i understand the spec, this should be the only legal way to point an alias at the null namespace, and it does indeed work ... up to a point.

       

      when testing this i found multiple issues with avro's handling of such aliases, dating back to late avro 1.7.*

       

      1. without these aliases, decoding does fail, but it fails on the nested enum, whereas it should have failed "immediately" on the fullname mismatch between the top-level record schemas. in fact, on further testing i think avro (at least in java) doesn't compare the fullnames of the top-level writer and reader schemas at all.
      2. while the schema with the aliases parse()es fine, Schema.toString() strips the leading dots from the aliases, thereby creating a "monsanto terminator schema": once printed and parsed again, the aliases become "simple" aliases, inherit the declaring type's namespace, and stop working.
      3. the spec doesn't explicitly say how to use aliases to "target" the null namespace. if this is an intentional feature, i think the spec should be expanded a little to cover it.
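
to illustrate item 2, here is a small plain-java model of the print/parse round trip. it does not call avro. `AliasRoundTrip`, `printLossy` and `printPreservingDot` are hypothetical names; `printLossy` only models the behaviour described above, and `printPreservingDot` is one possible way to keep the dot, not necessarily what the attached patch does:

```java
// Plain-Java model of the alias round trip; hypothetical names, not Avro code.
final class AliasRoundTrip {

  /** Splits alias text into {namespace-or-null, simpleName}, relative to the declaring namespace. */
  static String[] parse(String alias, String declaringNamespace) {
    int lastDot = alias.lastIndexOf('.');
    if (lastDot < 0) {
      return new String[] { declaringNamespace, alias }; // simple alias inherits
    }
    String ns = alias.substring(0, lastDot);
    // Assumption: a leading dot (empty namespace) means the null namespace.
    return new String[] { ns.isEmpty() ? null : ns, alias.substring(lastDot + 1) };
  }

  /** Models the reported behaviour: a null-namespace alias is printed as a bare name. */
  static String printLossy(String[] alias, String declaringNamespace) {
    String ns = alias[0];
    return (ns == null || ns.equals(declaringNamespace)) ? alias[1] : ns + "." + alias[1];
  }

  /** One possible alternative: keep the leading dot when the declaring type is namespaced. */
  static String printPreservingDot(String[] alias, String declaringNamespace) {
    String ns = alias[0];
    if (ns == null) {
      return declaringNamespace == null ? alias[1] : "." + alias[1];
    }
    return ns.equals(declaringNamespace) ? alias[1] : ns + "." + alias[1];
  }

  static String fullname(String[] alias) {
    return alias[0] == null ? alias[1] : alias[0] + "." + alias[1];
  }

  public static void main(String[] args) {
    String ns = "much.namespace";
    String[] original = parse(".AncientEnum", ns);

    String lossy = printLossy(original, ns);
    System.out.println(lossy);                       // AncientEnum
    System.out.println(fullname(parse(lossy, ns)));  // much.namespace.AncientEnum

    String kept = printPreservingDot(original, ns);
    System.out.println(kept);                        // .AncientEnum
    System.out.println(fullname(parse(kept, ns)));   // AncientEnum
  }
}
```

in this model the lossy form silently changes which writer type the alias refers to after one print/parse cycle, while the dot-preserving form round-trips to the same fullname.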

       

      i have code to reproduce all these issues in https://github.com/radai-rosenblatt/avro/blob/aliasing-to-null-namespace/lang/java/avro/src/test/java/org/apache/avro/TestAliasToNullNamespace.java (coded against master)

       

      i also have code to reproduce all the above against multiple older avro versions in https://github.com/linkedin/avro-util/blob/master/helper/tests/helper-tests-allavro/src/test/java/com/linkedin/avroutil1/compatibility/AvroTypeAliasesTest.java

      Attachments

        1. AVRO-3512.patch
          0.9 kB
          Radai Rosenblatt

            People

              Assignee: Unassigned
              Reporter: Radai Rosenblatt

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m