Avro / AVRO-620

Python implementation doesn't stringify sub-schemas correctly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: python
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In [9]: import avro.schema
      
      In [10]: s = avro.schema.parse('{"type": "record", "name": "X", "fields": [{"name": "y", "type": {"type": "record", "name": "Y", "fields": [{"name": "Z", "type": "X"}]}}]}')
      
      In [11]: str(s.fields[0].type)
      Out[11]: '{"fields": [{"type": "X", "name": "Z"}], "type": "record", "name": "Y"}'
      

      str(schema) is used in Avro data files to record the schema. In the case above, serializing the schema for Y should also serialize the definition of X, because Y refers to X by name and is not self-contained without it.
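
      The expected behavior can be sketched on plain parsed JSON, without the avro package. This is an illustration only, not necessarily what the attached patch does: the helper names collect_named and self_contained are hypothetical, and namespaces and aliases are ignored. The idea is to inline the first reference to each named type that the sub-schema does not define itself, and to keep later references as plain names.

```python
import json

# Avro types that are defined once and can then be referenced by name.
NAMED_TYPES = ("record", "error", "enum", "fixed")


def collect_named(schema, registry=None):
    """Index every named type defined in a parsed schema by its name."""
    registry = {} if registry is None else registry
    if isinstance(schema, list):  # union: visit every branch
        for branch in schema:
            collect_named(branch, registry)
    elif isinstance(schema, dict):
        if schema.get("type") in NAMED_TYPES:
            registry.setdefault(schema["name"], schema)
        for field in schema.get("fields", []):
            collect_named(field["type"], registry)
        for key in ("items", "values"):  # array and map element types
            if key in schema:
                collect_named(schema[key], registry)
    return registry


def self_contained(schema, registry, seen=None):
    """Return a copy of schema in which every referenced named type is defined."""
    seen = set() if seen is None else seen
    if isinstance(schema, str):
        # A name that has not been defined yet in this output: inline it.
        if schema in registry and schema not in seen:
            return self_contained(registry[schema], registry, seen)
        return schema  # primitive type, or a name already defined
    if isinstance(schema, list):
        return [self_contained(branch, registry, seen) for branch in schema]
    if schema.get("type") in NAMED_TYPES:
        if schema["name"] in seen:
            return schema["name"]  # already defined: refer to it by name
        seen.add(schema["name"])
    out = dict(schema)  # shallow copy, the input is left untouched
    if "fields" in out:
        out["fields"] = [
            dict(field, type=self_contained(field["type"], registry, seen))
            for field in out["fields"]
        ]
    for key in ("items", "values"):
        if key in out:
            out[key] = self_contained(out[key], registry, seen)
    return out


full = json.loads(
    '{"type": "record", "name": "X", "fields": [{"name": "y", "type": '
    '{"type": "record", "name": "Y", "fields": [{"name": "Z", "type": "X"}]}}]}'
)
registry = collect_named(full)
sub_schema = full["fields"][0]["type"]  # the schema for Y
standalone = self_contained(sub_schema, registry)
print(json.dumps(standalone))
```

      In the printed result, Y carries the full definition of X in field Z, while the reference from X back to Y stays a plain name, so the recursion terminates.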

      I ran into this when using a schema from a protocol to write a data file: many of the types turned out to be undefined in the schema recorded in the generated Avro data file.

      1. AVRO-620.patch.txt
        18 kB
        Philip Zeyliger

          People

          • Assignee:
            Philip Zeyliger
            Reporter:
            Philip Zeyliger
          • Votes:
            0
            Watchers:
            1
