Avro / AVRO-838

Support reading of files created with Avro 1.4 that use invalid characters in field and record names

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0, 1.5.1
    • Fix Version/s: 1.5.2
    • Component/s: java
    • Labels:
      None

      Description

      Avro 1.4 had a bug that let users create schemas with invalid characters in field and record names.

      For example, Avro 1.4 accepted the '-' character in field and record names, but Avro 1.5 now fails when it parses a schema that contains such a name.

      We need some kind of compatibility mode that supports schemas with invalid characters, so that existing files created with Avro 1.4 can be read by Avro 1.5.

      1. AVRO-838.patch
        3 kB
        Doug Cutting

        Activity

        Doug Cutting added a comment -

        Here's a patch for this.

        Can folks please verify that this permits them to open files whose schemas contain illegal names?

        Also, should we try to convert these names to valid names, e.g., replacing '-' with '_' or somesuch?
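        The conversion asked about above could look roughly like the following. This is a minimal sketch, not part of the attached patch: the class NameSanitizer and its methods are hypothetical, and it assumes the Avro naming rule that a name starts with a letter or underscore and continues with letters, digits, or underscores. It treats its input as a simple name, so a '.' in a namespace-qualified name would also be replaced.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Avro or of AVRO-838.patch. */
final class NameSanitizer {
    // Assumed naming rule: [A-Za-z_] first, then [A-Za-z0-9_].
    private static final Pattern VALID = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private NameSanitizer() {}

    /** True if the simple name satisfies the assumed naming rule. */
    static boolean isValid(String name) {
        return VALID.matcher(name).matches();
    }

    /** Replaces every illegal character with '_' and guards against a leading digit. */
    static String sanitize(String name) {
        if (name.isEmpty()) {
            return "_";
        }
        String cleaned = name.replaceAll("[^A-Za-z0-9_]", "_");
        // A leading digit is also illegal, so prefix an underscore.
        if (Character.isDigit(cleaned.charAt(0))) {
            cleaned = "_" + cleaned;
        }
        return cleaned;
    }

    public static void main(String[] args) {
        String original = "Cascading-Schema-0";
        System.out.println(original + " valid? " + isValid(original));
        System.out.println("sanitized: " + sanitize(original));
    }
}
```

        One caveat with any rewriting scheme of this kind: distinct names can collide after conversion (for example, "a-b" and "a_b" both become "a_b"), and readers that look up fields by their original names would no longer find them.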

        Doug Cutting added a comment -

        The current patch disables all name validation for schemas read from data files. This makes Java tolerant in what it accepts from other Avro implementations, including older versions of Java.
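        The idea of switching validation off for one code path can be sketched as below. This is an illustration of the approach only, under stated assumptions: the class NameCheck and its flag are hypothetical and are not the code in the patch or Avro's API. It uses the same assumed naming rule (letter or underscore first, then letters, digits, or underscores).

```java
import java.util.regex.Pattern;

/** Illustrative only; the class and method names are hypothetical, not Avro's. */
final class NameCheck {
    private static final Pattern VALID = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private final boolean validate;

    NameCheck(boolean validate) {
        this.validate = validate;
    }

    /** Returns the name unchanged, or throws if validation is on and the name is illegal. */
    String check(String name) {
        if (validate && !VALID.matcher(name).matches()) {
            throw new IllegalArgumentException("Illegal character in: " + name);
        }
        return name;
    }

    public static void main(String[] args) {
        NameCheck strict = new NameCheck(true);    // e.g. schemas supplied by the user
        NameCheck lenient = new NameCheck(false);  // e.g. schemas read from a data file
        System.out.println("lenient accepts: " + lenient.check("Cascading-Schema-0"));
        try {
            strict.check("Cascading-Schema-0");
        } catch (IllegalArgumentException e) {
            System.out.println("strict rejects: " + e.getMessage());
        }
    }
}
```

        Keeping the strict check for newly authored schemas while relaxing it only for schemas found in existing files means old data stays readable without letting new invalid names be created.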

        Doug Cutting added a comment -

        Has anyone tried this? Should I just commit it?

        Ken Krugler added a comment -

        Hi Doug,

        Sorry for the delay. Looks like it works, thanks!

        For the record, we tried it out on an Avro file created with 1.4 (and invalid chars in the record name):

        java -jar avro-tools-1.6.0-SNAPSHOT.jar getschema ptd-sample.avro

        and it successfully dumped the schema.

        When we tried the same with avro-tools-1.5.1 it generated the following error:

        Exception in thread "main" org.apache.avro.SchemaParseException: Illegal character in: Cascading-Schema-0

        Doug Cutting added a comment -

        I committed this.


          People

          • Assignee: Doug Cutting
          • Reporter: Ken Krugler
          • Votes: 0
          • Watchers: 0
