Avro / AVRO-153

Naming conventions for avro schemas, records and protocols

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: spec
    • Labels:
      None

      Description

      It would be nice to add a few paragraphs to the spec with suggested naming conventions. I'm ambivalent as to what they actually are, but while the paint hasn't fully settled, it might be nice to lead the project in one direction or another.

      Any thoughts on what the best style here is?

          Activity

          Doug Cutting added a comment -

          We currently define identifiers with:

          Record, field and enum names must:

          • start with [A-Za-z_]
          • subsequently contain only [A-Za-z0-9_]

          However, we don't say anything about namespaces. I propose that a namespace must be a series of dot-delimited identifiers, each as defined above.
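The identifier rule and the proposed namespace rule can be checked mechanically. The following is a minimal sketch, not the Avro implementation; the function names `is_valid_identifier` and `is_valid_namespace` are made up for illustration.

```python
import re

# Identifier rule quoted above: first character in [A-Za-z_],
# every following character in [A-Za-z0-9_].
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if `name` is a legal record, field or enum name."""
    return IDENTIFIER.fullmatch(name) is not None


def is_valid_namespace(namespace: str) -> bool:
    """Proposed rule: a namespace is a dot-delimited series of identifiers."""
    return all(is_valid_identifier(part) for part in namespace.split("."))
```

Under these rules a dot is never legal inside a single identifier, so `org.apache.avro` passes only as a namespace, and an empty element (as in `org..avro`) is rejected.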

          As for encouraged naming conventions, I'd opt for Java's:

          • namespaces are hierarchical, with the root at left, starting with a reversed domain name.
          • namespace elements are lowercase
          • record and enum names are capitalized CamelCase
          • field and message names are uncapitalized camelCase
          • enum symbols are all-cap
          • acronyms embedded in names are capitalized like ordinary words, e.g., Md5Hash, BaseUrl, etc.
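As a rough illustration, most of these conventions can be expressed as patterns. This is only a sketch under assumptions: the `kind` labels and the function `follows_convention` are hypothetical, and the patterns cannot tell whether a namespace really starts with a reversed domain name or whether an acronym was written as `MD5Hash` instead of `Md5Hash`.

```python
import re

# One pattern per kind of name, following the Java-style conventions listed above.
CONVENTIONS = {
    # namespace elements are lowercase, separated by dots
    "namespace": re.compile(r"[a-z_][a-z0-9_]*(\.[a-z_][a-z0-9_]*)*"),
    # record and enum names: capitalized CamelCase
    "record": re.compile(r"[A-Z][A-Za-z0-9]*"),
    "enum": re.compile(r"[A-Z][A-Za-z0-9]*"),
    # field and message names: uncapitalized camelCase
    "field": re.compile(r"[a-z][A-Za-z0-9]*"),
    "message": re.compile(r"[a-z][A-Za-z0-9]*"),
    # enum symbols: all caps
    "enum_symbol": re.compile(r"[A-Z][A-Z0-9_]*"),
}


def follows_convention(kind: str, name: str) -> bool:
    """Return True if `name` matches the suggested style for `kind`."""
    return CONVENTIONS[kind].fullmatch(name) is not None
```

A check like this would be advisory only: the conventions are encouraged, not required, so a name that fails it is still a legal identifier.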
          Jeff Hammerbacher added a comment -

          Pulling in some comments from Doug on the dev mailing list about the expanded scope for this issue:

          The spec should be updated. Schemas can specify a namespace. If they're nested in another schema or protocol then the namespace defaults to the namespace of the containing schema or protocol.
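The defaulting rule can be sketched as a walk over a parsed schema. This is an illustration only, assuming the schema is already loaded as Python dicts and lists and that names are not namespace-qualified; `effective_namespaces` and the example schema are hypothetical.

```python
def effective_namespaces(schema, enclosing=None, found=None):
    """Collect {name: namespace} for every named type in a parsed schema.

    A named type without its own "namespace" inherits the namespace of
    the schema that contains it.
    """
    if found is None:
        found = {}
    if isinstance(schema, list):  # a union: visit each branch
        for branch in schema:
            effective_namespaces(branch, enclosing, found)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind in ("record", "enum", "fixed"):
            namespace = schema.get("namespace", enclosing)
            found[schema["name"]] = namespace
            # Nested field types see this type's namespace as their default.
            for field in schema.get("fields", []):
                effective_namespaces(field["type"], namespace, found)
        elif kind == "array":
            effective_namespaces(schema["items"], enclosing, found)
        elif kind == "map":
            effective_namespaces(schema["values"], enclosing, found)
    return found


# Hypothetical schema: Inner has no namespace of its own, Suit overrides it.
EXAMPLE = {
    "type": "record", "name": "Outer", "namespace": "org.example",
    "fields": [
        {"name": "inner",
         "type": {"type": "record", "name": "Inner", "fields": []}},
        {"name": "suit",
         "type": {"type": "enum", "name": "Suit",
                  "namespace": "org.cards", "symbols": ["SPADES"]}},
    ],
}
```

Calling `effective_namespaces(EXAMPLE)` places `Inner` in `org.example` by inheritance, while `Suit` keeps its explicit `org.cards`.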

          Another thing that should be updated in the spec is that a name can be namespace-qualified. This is useful to refer to types in a different namespace, e.g., a field like:

          Unknown macro: {"name"}

          This is related to https://issues.apache.org/jira/browse/AVRO-153.

          The spec currently prohibits dots in identifiers. We should clarify that dots are permitted in namespaces, and that, if present in a name, the last dot separates the name from the namespace.
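The "last dot" rule is simple to state in code. A minimal sketch, with `split_full_name` as a hypothetical helper name:

```python
def split_full_name(name, default_namespace=None):
    """Split a possibly namespace-qualified name into (namespace, simple name).

    If the name contains a dot, the last dot separates the namespace from
    the name; otherwise the caller-supplied default namespace applies.
    """
    namespace, dot, simple = name.rpartition(".")
    if not dot:
        return default_namespace, name
    return namespace, simple
```

For example, `split_full_name("org.apache.avro.Md5Hash")` yields the namespace `org.apache.avro` and the name `Md5Hash`, while an unqualified `Md5Hash` falls back to whatever default namespace the enclosing schema provides.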

          Doug Cutting added a comment -

          This is mostly a duplicate of AVRO-253, which has been fixed. The spec now better defines names and namespaces. The spec does not yet have recommendations about, e.g., use of capitalization within names, etc. Philip, do you still feel we should add that to the spec?

          Patrick Linehan added a comment -

          I think it makes sense to recommend naming conventions. If not in the spec itself, at least in some kind of "style guide" quickly accessible from the main Avro web pages. Protobuf has a style guide in this vein: http://code.google.com/apis/protocolbuffers/docs/style.html

          It would also be good to include on that page the recommended file extensions for the various Avro files. As a n00b, I had to infer this convention from disparate pages on the web. The convention seems to be:

          *.avsc: Avro type schema
          *.avpr: Avro protocol schema
          *.avdl: Avro IDL file
          *.avro: Avro data file
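For tooling, the extension convention above could be captured in a small lookup table. This is a sketch only; `AVRO_FILE_KINDS` and `avro_file_kind` are hypothetical names, and matching extensions case-insensitively is an assumption rather than part of the convention.

```python
from pathlib import PurePath

# Extension-to-kind table, copied from the convention listed above.
AVRO_FILE_KINDS = {
    ".avsc": "Avro type schema",
    ".avpr": "Avro protocol schema",
    ".avdl": "Avro IDL file",
    ".avro": "Avro data file",
}


def avro_file_kind(path):
    """Return the kind of Avro file implied by the extension, or None."""
    return AVRO_FILE_KINDS.get(PurePath(path).suffix.lower())
```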

          Doug Cutting added a comment -

          +1 for a style guide & standard documenting file extensions.


            People

            • Assignee:
              Unassigned
              Reporter:
              Philip Zeyliger
            • Votes:
              0
              Watchers:
              4
