AVRO-1395

Order of fields returned by DataFileReader should match the order defined in Avro schema

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.7.5
    • Fix Version/s: None
    • Component/s: python
    • Labels:
      None

      Description

      The Python `DataFileReader` class lets you iterate over the records of an Avro file. Each record is represented by a standard Python dictionary in which each element corresponds to a single field of the record. Note that this dictionary does not define any particular order of its elements. I claim that it would be better if the order of elements followed the order of fields as defined in the file's Avro schema. The record would then be more human-friendly to read, which matters, for example, to a user who wants to view the record as a JSON string. Consider that the order of fields in an Avro schema usually has some significance: the most important fields (like ID) tend to come first.

      Making the record representation follow the order of fields defined in the Avro schema seems to be pretty easy. As of Avro 1.7.5, in the `read_record` method of the `io.DatumReader` class, you would only have to change the second line of code from `read_record = {}` to `read_record = OrderedDict()`.
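
Until such a change lands, a caller could reorder each record on their own side. The following is only a minimal sketch of that idea, not Avro code: the helper `order_record_by_schema` and the `User` schema are hypothetical, the schema is parsed with the standard `json` module rather than the Avro library, and nested records are not handled.

```python
import json
from collections import OrderedDict

# Hypothetical schema, made up for illustration only.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": "string"}
  ]
}
"""


def order_record_by_schema(record, schema):
    """Return a copy of `record` whose keys follow the schema's field order.

    Hypothetical helper: handles only the top-level fields of a record
    schema given as a parsed JSON dictionary.
    """
    ordered = OrderedDict()
    for field in schema["fields"]:
        name = field["name"]
        if name in record:
            ordered[name] = record[name]
    return ordered


schema = json.loads(SCHEMA_JSON)

# Stand-in for a record as a reader might return it, in arbitrary key order.
record = {"email": "a@example.com", "name": "Ann", "id": 7}

ordered = order_record_by_schema(record, schema)
as_json = json.dumps(ordered)  # key order follows the schema's field order
print(as_json)
```

Because `json.dumps` writes keys in the mapping's iteration order, the JSON view of the record starts with `id`, as the schema does.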


          People

          • Assignee: Unassigned
          • Reporter: Mateusz Kobos
          • Votes: 0
          • Watchers: 1
