Avro / AVRO-160

file format should be friendly to streaming

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: spec
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change

      Description

      It should be possible to stream through an Avro data file without seeking to the end.

      Currently the interpretation is that schemas written to the file apply to all entries before them. If this were changed so that they instead apply to all entries that follow, and the initial schema is written at the start of the file, then streaming could be supported.
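      The proposed forward-applying interpretation could be sketched roughly as follows. This is only an illustration: the tagged `("schema", ...)` / `("entry", ...)` items and the name `stream_entries` are hypothetical and do not represent the actual Avro file layout or API. The point is that a reader needs only the most recently seen schema and never has to seek.

```python
from typing import Any, Iterable, Iterator, Tuple

# Hypothetical in-memory model of a data file: a sequence of tagged items.
# ("schema", s) declares a schema; ("entry", e) is one datum.
# This is NOT the Avro wire format, just a model of the ordering rule.
Item = Tuple[str, Any]


def stream_entries(items: Iterable[Item]) -> Iterator[Tuple[Any, Any]]:
    """Yield (schema, entry) pairs in a single forward pass.

    Each schema applies to all entries that follow it, until the next
    schema item replaces it.
    """
    current_schema = None
    for kind, payload in items:
        if kind == "schema":
            current_schema = payload
        elif kind == "entry":
            if current_schema is None:
                # The initial schema must come at the start of the file.
                raise ValueError("entry seen before any schema")
            yield current_schema, payload
        else:
            raise ValueError(f"unknown item kind: {kind!r}")
```

      Because the function is a generator, it can consume a pipe or socket just as well as a file on disk.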

      Note that the only change permitted to a schema while a file is being written is, if the schema is a union, to add new branches at the end of that union. If it is not a union, no changes may be made. So it is still the case that the final schema in a file can read every entry in the file and thus may be used to access the file randomly.
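      A minimal sketch of that rule, assuming schemas are held as parsed JSON so that a union is a Python list of branch schemas (as in Avro's JSON schema syntax). The function name `is_permitted_change` is hypothetical and is not part of any Avro library.

```python
def is_permitted_change(old, new) -> bool:
    """Return True if `new` may replace `old` part-way through a file.

    Schemas are parsed JSON values; a union is a list of branches.
    """
    if old == new:
        # Re-stating the same schema is not a change at all.
        return True
    if isinstance(old, list) and isinstance(new, list):
        # Union: existing branches must be kept, in order, as a prefix,
        # and at least one new branch must be appended at the end.
        return len(new) > len(old) and new[:len(old)] == old
    # Non-union schemas may not change.
    return False
```

      Since every earlier union is a prefix of the final one, a reader holding only the final schema can still resolve any branch written earlier in the file.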

        Attachments

        1. AVRO-160.patch
          41 kB
          Doug Cutting
        2. AVRO-160.patch
          41 kB
          Doug Cutting
        3. AVRO-160.patch
          39 kB
          Doug Cutting
        4. AVRO-160.patch
          33 kB
          Doug Cutting
        5. AVRO-160-python.patch
          15 kB
          Jeff Hammerbacher

              People

              • Assignee: Doug Cutting (cutting)
              • Reporter: Doug Cutting (cutting)
              • Votes: 1
              • Watchers: 3
