Apache Avro / AVRO-1843

Clarify importance of writer's schema in documentation

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.9.0
    • Component/s: doc
    • Labels: None

    Description

      I'll be submitting a PR with improvements to the Java Getting Started page and the Specification. The changes make it clearer that Avro must read all data with the writer's schema before converting it to the reader's schema, explain why, and explain that this is why the schema should be available alongside the serialized data. Currently it is arguably too easy to misread Avro as requiring only a single schema, the reader's, to read data while still following the resolution rules, which makes Avro seem similar to JSON (resolution by field name). For example, the Java API examples appear to involve only one schema, hiding the fact that the writer's schema is read in implicitly. The ability to serialize to JSON, where field names and some type information are present, also makes this misconception easy to believe.
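
To illustrate the point, here is a minimal sketch of why a binary format that stores no field names needs the writer's schema at read time. It is a toy model that uses only the Java standard library, not Avro's actual encoding or API: a "schema" is assumed to be just an ordered list of string-valued field names, and the names `WriterSchemaDemo`, `write`, `decode`, and `resolve` are hypothetical. In the real Avro Java API, the comparable step is constructing a `GenericDatumReader` with both the writer's and the reader's schema.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a "schema" is an ordered list of string-valued field names.
// This is NOT Avro's real binary encoding or API.
class WriterSchemaDemo {

    // Serialize values in the writer's field order. No field names are written.
    static byte[] write(List<String> writerFields, Map<String, String> record) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String field : writerFields) {
            out.writeUTF(record.get(field));
        }
        out.flush();
        return bytes.toByteArray();
    }

    // Decode values positionally, labelling them with whatever field list is given.
    static Map<String, String> decode(List<String> fields, byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        Map<String, String> result = new LinkedHashMap<>();
        for (String field : fields) {
            result.put(field, in.readUTF());
        }
        return result;
    }

    // Decode with the writer's fields first, then project onto the reader's
    // fields by name, filling fields the writer did not have from defaults.
    static Map<String, String> resolve(List<String> writerFields, List<String> readerFields,
                                       Map<String, String> readerDefaults, byte[] data) throws IOException {
        Map<String, String> written = decode(writerFields, data);
        Map<String, String> result = new LinkedHashMap<>();
        for (String field : readerFields) {
            if (written.containsKey(field)) {
                result.put(field, written.get(field));
            } else if (readerDefaults.containsKey(field)) {
                result.put(field, readerDefaults.get(field));
            } else {
                throw new IllegalStateException("No value or default for field: " + field);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        List<String> writerFields = List.of("name", "email");
        byte[] data = write(writerFields, Map.of("name", "Ada", "email", "ada@example.org"));

        // Reader's field list alone: values are matched by position, so they are mislabelled.
        System.out.println(decode(List.of("email", "name"), data));

        // Writer's and reader's field lists together: values are matched by name.
        System.out.println(resolve(writerFields, List.of("email", "name", "country"),
                Map.of("country", "unknown"), data));
    }
}
```

In this sketch, decoding with only the reader's field list silently swaps `name` and `email`, because the bytes carry values by position. Name-based resolution becomes possible only after the data has been decoded with the field list it was written with, which is the reason the writer's schema has to travel with the data.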

          People

            Assignee: Shannon Carey (rehevkor5)
            Reporter: Shannon Carey (rehevkor5)
            Votes: 0
            Watchers: 5
