Avro / AVRO-620

Python implementation doesn't stringify sub-schemas correctly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.4.0
    • Component/s: python
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      In [9]: import avro.schema
      
      In [10]: s = avro.schema.parse('{"type": "record", "name": "X", "fields": [{"name": "y", "type": {"type": "record", "name": "Y", "fields": [{"name": "Z", "type": "X"}]}}]}')
      
      In [11]: str(s.fields[0].type)
      Out[11]: '{"fields": [{"type": "X", "name": "Z"}], "type": "record", "name": "Y"}'
      

      str(schema) is used in Avro data files to record the schema. In the case above, when we serialize the schema for Y, we should also serialize the schema for X, since Y refers to X by name; without X's definition, the output for Y is not self-contained.

      I ran smack into this when using a schema from a protocol to write a data file and finding that many of the types weren't defined in the generated Avro data file.
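
The expected behaviour can be sketched without the avro library at all. The snippet below is only an illustration of the idea, not the approach taken in AVRO-620.patch.txt: `collect_named` and `expand` are hypothetical helpers working on plain JSON dicts, and namespaces are ignored for brevity. When stringifying Y, the first reference to X is replaced by X's full definition, and later references to types already emitted stay as bare names.

```python
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
NAMED = {"record", "enum", "fixed"}


def collect_named(schema, names=None):
    """Map each named type's name to its full definition (hypothetical helper)."""
    if names is None:
        names = {}
    if isinstance(schema, list):  # union: visit every branch
        for branch in schema:
            collect_named(branch, names)
    elif isinstance(schema, dict):
        if schema.get("type") in NAMED:
            names.setdefault(schema["name"], schema)
        for field in schema.get("fields", []):
            collect_named(field["type"], names)
        for key in ("items", "values"):  # array items / map values
            if key in schema:
                collect_named(schema[key], names)
    return names


def expand(schema, names, seen=None):
    """Return a copy of `schema` that defines every named type it uses (hypothetical helper)."""
    if seen is None:
        seen = set()
    if isinstance(schema, str):
        # Primitives, already-emitted names, and unknown names stay as they are.
        if schema in PRIMITIVES or schema in seen or schema not in names:
            return schema
        return expand(names[schema], names, seen)
    if isinstance(schema, list):
        return [expand(branch, names, seen) for branch in schema]
    if schema.get("type") in NAMED:
        if schema["name"] in seen:
            return schema["name"]  # defined earlier in the output: refer by name
        seen.add(schema["name"])
    out = dict(schema)
    if "fields" in schema:
        out["fields"] = [
            dict(field, type=expand(field["type"], names, seen))
            for field in schema["fields"]
        ]
    for key in ("items", "values"):
        if key in schema:
            out[key] = expand(schema[key], names, seen)
    return out


# The schema from the description, as plain JSON.
top = json.loads(
    '{"type": "record", "name": "X", "fields": [{"name": "y", "type": '
    '{"type": "record", "name": "Y", "fields": [{"name": "Z", "type": "X"}]}}]}'
)
names = collect_named(top)
y_schema = top["fields"][0]["type"]
y_self_contained = expand(y_schema, names)
print(json.dumps(y_self_contained))
```

With this sketch, the field Z inside Y carries the full record definition of X, and X's own field y refers back to Y by name, so the string for Y can be parsed on its own.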

      1. AVRO-620.patch.txt
        18 kB
        Philip Zeyliger

        Activity

        Doug Cutting made changes -
          Status: Resolved [ 5 ] -> Closed [ 6 ]
        Philip Zeyliger made changes -
          Hadoop Flags: (empty) -> [Reviewed]
          Status: Open [ 1 ] -> Resolved [ 5 ]
          Assignee: (empty) -> Philip Zeyliger [ philip ]
          Fix Version/s: (empty) -> 1.4.0 [ 12314789 ]
          Resolution: (empty) -> Fixed [ 1 ]
        Philip Zeyliger made changes -
          Attachment: (empty) -> AVRO-620.patch.txt [ 12452720 ]
        Philip Zeyliger created issue -

          People

          • Assignee: Philip Zeyliger
          • Reporter: Philip Zeyliger
          • Votes: 0
          • Watchers: 1
