  1. Apache Avro
  2. AVRO-3528

Optionally support strict LogicalType parsing

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.11.0
    • Fix Version/s: 1.13.0
    • Component/s: java

    Description

      My organization uses Avro schemas extensively. We use Confluent Schema Registry for data governance, enforcing data contracts between components. We are seeing a proliferation of questionable LogicalType structures in our schemas, such as the following:

      {
        "namespace": "org.apache.avro.example",
        "type": "record",
        "name": "BadLogical",
        "fields": [
          {
            "name": "f0",
            "type": {
              "type": "string",
              "java-class": "java.math.BigDecimal",
              "logicalType": "decimal",
              "precision": 9,
              "scale": 2
            }
          }
        ]
      }

      There are two issues in the above structure:

      1. string is not allowed to back the decimal LogicalType; the Avro specification permits only bytes or fixed
      2. the java-class property, like several other properties, is incompatible with any LogicalType

      Currently, Avro lets such structures pass validation and offers no option to reject them. Because Confluent Schema Registry delegates all Avro schema validation to Avro, these structures can be registered.

      Proposition:

      Implement an option that switches Avro's Schema.Parser into a strict LogicalType mode, in which such structures are detected and rejected.
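
      To make the proposal concrete, here is a minimal sketch of the kind of check a strict mode might perform on a single type node. It is an illustration only, not the implementation mentioned in this issue and not part of Avro's API: the class StrictLogicalTypeCheck, its violations method, and the representation of the type node as a plain Map are all hypothetical. The backing-type table covers only a few logical types from the Avro specification, and the list of incompatible properties is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical strict check for illustration; not part of Avro's API. */
final class StrictLogicalTypeCheck {

  // Backing types the Avro specification permits for a few logical types.
  // Deliberately incomplete: logical types not listed here are not checked.
  private static final Map<String, Set<String>> ALLOWED_BACKING = Map.of(
      "decimal", Set.of("bytes", "fixed"),
      "date", Set.of("int"),
      "time-millis", Set.of("int"),
      "timestamp-millis", Set.of("long"));

  // Properties assumed (for this sketch) to conflict with any logical type.
  private static final Set<String> INCOMPATIBLE_PROPS = Set.of(
      "avro.java.string", "java-class", "java-key-class", "java-element-class");

  /** Returns a description of each problem found; empty if the node looks fine. */
  static List<String> violations(Map<String, Object> typeNode) {
    List<String> problems = new ArrayList<>();
    Object logical = typeNode.get("logicalType");
    if (logical == null) {
      return problems; // no logical type, nothing to check
    }

    // Check 1: the underlying type must be one the logical type may annotate.
    Set<String> allowed = ALLOWED_BACKING.get(logical);
    Object backing = typeNode.get("type");
    if (allowed != null && (backing == null || !allowed.contains(backing))) {
      problems.add("logicalType '" + logical + "' cannot be backed by '" + backing
          + "'; expected one of " + allowed);
    }

    // Check 2: Java-specific hints must not appear next to a logical type.
    for (String key : typeNode.keySet()) {
      if (key != null && INCOMPATIBLE_PROPS.contains(key)) {
        problems.add("property '" + key + "' is incompatible with logicalType '"
            + logical + "'");
      }
    }
    return problems;
  }

  public static void main(String[] args) {
    // The type node of field f0 from the schema in this issue.
    Map<String, Object> f0Type = new LinkedHashMap<>();
    f0Type.put("type", "string");
    f0Type.put("java-class", "java.math.BigDecimal");
    f0Type.put("logicalType", "decimal");
    f0Type.put("precision", 9);
    f0Type.put("scale", 2);

    violations(f0Type).forEach(System.out::println);
  }
}
```

      Run against the f0 type node from the schema above, the sketch reports both problems: the string backing type and the java-class property. In strict mode the parser would presumably fail on the first such violation rather than collect them; collecting is just convenient for demonstration.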

      The change is fairly trivial, since most of the plumbing is already there. I have implemented the required functionality and covered it with tests, but I would like to solicit feedback on this proposal before submitting my PR.

            People

              Assignee: Andrei Leibovski (AndreiLeib)
              Reporter: Andrei Leibovski (AndreiLeib)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m