AVRO-620

Python implementation doesn't stringify sub-schemas correctly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: python
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In [9]: import avro.schema
      
      In [10]: s = avro.schema.parse('{"type": "record", "name": "X", "fields": [{"name": "y", "type": {"type": "record", "name": "Y", "fields": [{"name": "Z", "type": "X"}]}}]}')
      
      In [11]: str(s.fields[0].type)
      Out[11]: '{"fields": [{"type": "X", "name": "Z"}], "type": "record", "name": "Y"}'
      

      str(schema) is used in Avro data files to record the schema. In the case above, when we serialize the schema for Y, we should also serialize the schema for X, since Y refers to X. As it stands, the output names X without defining it anywhere.
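The gap described above can be checked mechanically. The following is a minimal sketch using only the standard `json` module, not the avro library itself; `undefined_references`, `FULL` and `SUB` are hypothetical names introduced for illustration, and namespaces are ignored for simplicity. It walks a schema's JSON text and lists type names that are used but never defined in it.

```python
import json

# Avro primitive type names; any other bare string is a reference to a named type.
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
# Schema kinds that introduce a new name.
NAMED_KINDS = {"record", "error", "enum", "fixed"}


def undefined_references(schema_json):
    """Return type names used in schema_json that are never defined in it."""
    defined, missing = set(), []

    def walk(node):
        if isinstance(node, str):
            if node not in PRIMITIVES and node not in defined and node not in missing:
                missing.append(node)
        elif isinstance(node, list):  # a union: check every branch
            for branch in node:
                walk(branch)
        elif isinstance(node, dict):
            kind = node.get("type")
            if isinstance(kind, str) and kind in NAMED_KINDS:
                # Register the name before descending so recursive types resolve.
                defined.add(node["name"])
                for field in node.get("fields", []):
                    walk(field["type"])
            elif kind == "array":
                walk(node["items"])
            elif kind == "map":
                walk(node["values"])
            else:
                walk(kind)  # e.g. {"type": "string"} or a nested definition

    walk(json.loads(schema_json))
    return missing


# The full schema and the stringified sub-schema from the session above.
FULL = ('{"type": "record", "name": "X", "fields": [{"name": "y", "type": '
        '{"type": "record", "name": "Y", "fields": [{"name": "Z", "type": "X"}]}}]}')
SUB = '{"fields": [{"type": "X", "name": "Z"}], "type": "record", "name": "Y"}'

print(undefined_references(FULL))  # nothing missing: X is defined at the top level
print(undefined_references(SUB))   # X is referenced but not defined
```

Run against the `Out[11]` string, the check reports X as missing, which is the condition a reader of the data file would hit.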

      I ran smack into this when using a schema from a protocol to write a data file: looking at the generated Avro data file, many of the types were not defined.
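One way to produce a self-contained sub-schema is to inline each named type at its first use and refer to it by name afterwards. The sketch below illustrates that idea on plain dictionaries; it is an assumption about what a correct result could look like, not a description of the attached patch. `make_self_contained` and `registry` (a mapping from type name to its parsed definition) are hypothetical, and namespaces are again ignored.

```python
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
NAMED_KINDS = ("record", "error", "enum", "fixed")


def make_self_contained(schema, registry):
    """Return a copy of schema in which every named type is defined at first use."""
    defined = set()

    def expand(node):
        if isinstance(node, str):
            # A bare name: inline its definition unless it is already available.
            if node in PRIMITIVES or node in defined or node not in registry:
                return node
            return expand(registry[node])
        if isinstance(node, list):  # a union: expand every branch
            return [expand(branch) for branch in node]
        if isinstance(node, dict):
            out = dict(node)  # shallow copy so the input is not modified
            kind = out.get("type")
            if kind in NAMED_KINDS:
                if out["name"] in defined:
                    return out["name"]  # already emitted: refer to it by name
                defined.add(out["name"])
                if "fields" in out:
                    out["fields"] = [dict(f, type=expand(f["type"])) for f in out["fields"]]
            elif kind == "array":
                out["items"] = expand(out["items"])
            elif kind == "map":
                out["values"] = expand(out["values"])
            elif kind is not None:
                out["type"] = expand(kind)
            return out
        return node

    return expand(schema)


# Rebuild the example from the session: X contains Y, and Y refers back to X.
x = json.loads('{"type": "record", "name": "X", "fields": [{"name": "y", "type": '
               '{"type": "record", "name": "Y", "fields": [{"name": "Z", "type": "X"}]}}]}')
y = x["fields"][0]["type"]
registry = {"X": x, "Y": y}

print(json.dumps(make_self_contained(y, registry)))
```

For Y, the result defines X inline at field Z, and X's own field y refers back to Y by name, so no referenced type is left undefined.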

        Attachments

        1. AVRO-620.patch.txt
          18 kB
          Philip Zeyliger

            People

            • Assignee:
              Philip Zeyliger
              Reporter:
              Philip Zeyliger
            • Votes:
              0
              Watchers:
              1
