Beam / BEAM-7829

AvroUtils.toAvroSchema should set a schema name to pass Avro schema validation

Details

    Description

      While trying to use an Avro PCollection with the SQL transform, I noticed that a round-trip (bijective) transform does not work correctly: PCollection<GenericRecord> -> SQL -> PCollection<Row> -> ParDo -> PCollection<GenericRecord>. Some of the Avro metadata gets lost along the way, in particular the name of the Avro schema. This matters because Avro validates that the schema has a name, and if it does not, schema construction fails with a SchemaParseException:

      org.apache.avro.SchemaParseException: Illegal character in: EXPR$1
          at org.apache.avro.Schema.validateName (Schema.java:1151)
          at org.apache.avro.Schema.access$200 (Schema.java:81)
          at org.apache.avro.Schema$Field.<init> (Schema.java:403)
          at org.apache.avro.Schema$Field.<init> (Schema.java:423)
          at org.apache.avro.Schema$Field.<init> (Schema.java:415)
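
The trace above shows the failure surfacing in the `Schema$Field` constructor for the SQL-generated name `EXPR$1`. As a rough illustration of why that name is rejected, here is a minimal sketch assuming the naming rule from the Avro specification (a name starts with `[A-Za-z_]` and continues with `[A-Za-z0-9_]`). The class `AvroNameSketch` and its methods `isValidAvroName` and `sanitize` are hypothetical helpers written for this example; they are not part of Beam or Avro, and this is not necessarily how the fix for this issue was implemented.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Beam or Avro: mirrors the Avro spec's naming rule. */
final class AvroNameSketch {
    // Avro spec: names start with [A-Za-z_] and continue with [A-Za-z0-9_].
    private static final Pattern VALID_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** Returns true if the name satisfies the spec's rule (assumed here, not Avro's own code). */
    static boolean isValidAvroName(String name) {
        return name != null && VALID_NAME.matcher(name).matches();
    }

    /** Replaces every disallowed character with '_' and guards against a leading digit. */
    static String sanitize(String name) {
        if (name == null || name.isEmpty()) {
            return "_";
        }
        String cleaned = name.replaceAll("[^A-Za-z0-9_]", "_");
        char first = cleaned.charAt(0);
        return (first >= '0' && first <= '9') ? "_" + cleaned : cleaned;
    }

    public static void main(String[] args) {
        String sqlName = "EXPR$1"; // name taken from the stack trace above
        System.out.println(isValidAvroName(sqlName));            // false
        System.out.println(sanitize(sqlName));                   // EXPR_1
        System.out.println(isValidAvroName(sanitize(sqlName)));  // true
    }
}
```

A check like this could be applied to names before building the Avro schema. Whether to sanitize names silently or fail early with a clearer message is a design choice this sketch leaves open.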

            People

              rskraba Ryan Skraba
              iemejia Ismaël Mejía
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 40m