Avro
  1. Avro
  2. AVRO-152

Adding "doc" to record schemas, a la javadoc field and class comments.

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: java
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Signature of Schema constructors has been changed to include a "doc" parameter.

      Description

      The patch to follow adds "doc" to the understood schema language (for records and enums), and uses said doc when it generates java in SpecificCompiler.

      1. AVRO-149.patch.txt
        10 kB
        Philip Zeyliger
      2. AVRO-152.patch.txt
        32 kB
        Philip Zeyliger
      3. AVRO-152.patch.txt
        31 kB
        Philip Zeyliger
      4. AVRO-152.patch.txt
        23 kB
        Philip Zeyliger
      5. AVRO-152.patch.txt
        44 kB
        Philip Zeyliger

        Issue Links

          Activity

          Philip Zeyliger created issue -
          Hide
          Philip Zeyliger added a comment -

          If you want to follow along how this patch developed, see http://github.com/philz/avro/tree/docs

          This patch, which won't apply without AVRO-149 (though I could make it independent, if that proves to be useful; I sense that 149 will go in first, though), does the following:

          • Updates the spec to understand "doc" for records (both for the record class and its fields) and enums (only for the enum class). I haven't dealt with the enum values themselves, because they're specified as an array, so don't have a natural place to put the "doc". I could make it a parallel array, of course, but for now I've left it off. I've also not annotated protocols with "docs", though they deserve the same treatment.
          • Everywhere where Schema and Field previous just had name, there's now "name, doc". That code was largely limited to Schema.java, so it wasn't a big deal.
          • I did minor refactoring and clean-up in Schema.java, and made TestSchema.java run all of its schemas through the SpecificCompiler.
          • I changed SpecificCompiler to output to a String, separating the generation, and the file-writing. Performance isn't critical here, and this let me debug and test the actual code generation, without dealing with File IO.
          • I fixed a random bug in TestReflect, where the equals() method was wrong, and fixed a bug in TestValidatingIO, where I asserted that it is nonsense to have a fixed field of 0 bytes.
          Show
          Philip Zeyliger added a comment - If you want to follow along how this patch developed, see http://github.com/philz/avro/tree/docs This patch, which won't apply without AVRO-149 (though I could make it independent, if that proves to be useful; I sense that 149 will go in first, though), does the following: Updates the spec to understand "doc" for records (both for the record class and its fields) and enums (only for the enum class). I haven't dealt with the enum values themselves, because they're specified as an array, so don't have a natural place to put the "doc". I could make it a parallel array, of course, but for now I've left it off. I've also not annotated protocols with "docs", though they deserve the same treatment. Everywhere where Schema and Field previous just had name, there's now "name, doc". That code was largely limited to Schema.java, so it wasn't a big deal. I did minor refactoring and clean-up in Schema.java, and made TestSchema.java run all of its schemas through the SpecificCompiler. I changed SpecificCompiler to output to a String, separating the generation, and the file-writing. Performance isn't critical here, and this let me debug and test the actual code generation, without dealing with File IO. I fixed a random bug in TestReflect, where the equals() method was wrong, and fixed a bug in TestValidatingIO, where I asserted that it is nonsense to have a fixed field of 0 bytes.
          Philip Zeyliger made changes -
          Field Original Value New Value
          Attachment AVRO-149.patch.txt [ 12422319 ]
          Hide
          Doug Cutting added a comment -

          You've attached the patch for AVRO-149, not for this issue.

          Show
          Doug Cutting added a comment - You've attached the patch for AVRO-149 , not for this issue.
          Hide
          Doug Cutting added a comment -

          From your description, above, it sounds like you're doing a lot more than one thing in this patch. Is it possible to put the bugfixes and refactorings into separate issues?

          Show
          Doug Cutting added a comment - From your description, above, it sounds like you're doing a lot more than one thing in this patch. Is it possible to put the bugfixes and refactorings into separate issues?
          Hide
          Philip Zeyliger added a comment -

          Attaching the right patch this time; sorry about that.

          I could do a separate patch for TestReflect() and TestValidationIO(). I don't think the other stuff deserves it, though if you insist, I can take a stab at separating the doc support and the refactoring.

          Show
          Philip Zeyliger added a comment - Attaching the right patch this time; sorry about that. I could do a separate patch for TestReflect() and TestValidationIO(). I don't think the other stuff deserves it, though if you insist, I can take a stab at separating the doc support and the refactoring.
          Philip Zeyliger made changes -
          Attachment AVRO-152.patch.txt [ 12422401 ]
          Hide
          Doug Cutting added a comment -

          > if you insist, I can take a stab at separating the doc support and the refactoring [ ... ]

          I've not had a chance to review this yet, but patches that do many things are harder to review. Ease of review is important. It's almost always better to have more small, specific, patches than one jumbo general patch.

          Show
          Doug Cutting added a comment - > if you insist, I can take a stab at separating the doc support and the refactoring [ ... ] I've not had a chance to review this yet, but patches that do many things are harder to review. Ease of review is important. It's almost always better to have more small, specific, patches than one jumbo general patch.
          Hide
          Doug Cutting added a comment -

          > I've not had a chance to review this yet [ ...]

          In concrete terms: if this were multiple, small patches, some of them might be reviewed and committed already. As it stands, I don't want to dive into this one until Monday.

          Show
          Doug Cutting added a comment - > I've not had a chance to review this yet [ ...] In concrete terms: if this were multiple, small patches, some of them might be reviewed and committed already. As it stands, I don't want to dive into this one until Monday.
          Hide
          Philip Zeyliger added a comment -

          I certainly agree in general. Annoyingly, as often happens, the actual process was (do first steps), (realize need to refactor something), (do the rest) was not commutative, and I didn't stop myself in time to re-order the steps as they happened.

          Give it a shot for 10-20 minutes on Monday. If it doesn't make sense, I'll find some other way to present it that does. If it helps to see the steps I took explicitly, you can look at the git patches below. Unfortunately, since they're not really independent, individually I can't file a JIRA for one of them, since they'd either not apply or be works-in-progress that don't make sense.

          http://github.com/philz/avro/commit/b00e1a3bf4d1afc4244ccec15168b23229e2033e
          http://github.com/philz/avro/commit/97db7d7b8ecc40c2ff112411247f6db14260a98f
          http://github.com/philz/avro/commit/11b5d4c9caf68544de1027d3ed9da5baa2b21837
          http://github.com/philz/avro/commit/4d8c5bace37ef843fbfde2c8d63d35419f4125a8

          Show
          Philip Zeyliger added a comment - I certainly agree in general. Annoyingly, as often happens, the actual process was (do first steps), (realize need to refactor something), (do the rest) was not commutative, and I didn't stop myself in time to re-order the steps as they happened. Give it a shot for 10-20 minutes on Monday. If it doesn't make sense, I'll find some other way to present it that does. If it helps to see the steps I took explicitly, you can look at the git patches below. Unfortunately, since they're not really independent, individually I can't file a JIRA for one of them, since they'd either not apply or be works-in-progress that don't make sense. http://github.com/philz/avro/commit/b00e1a3bf4d1afc4244ccec15168b23229e2033e http://github.com/philz/avro/commit/97db7d7b8ecc40c2ff112411247f6db14260a98f http://github.com/philz/avro/commit/11b5d4c9caf68544de1027d3ed9da5baa2b21837 http://github.com/philz/avro/commit/4d8c5bace37ef843fbfde2c8d63d35419f4125a8
          Hide
          Doug Cutting added a comment -

          If, after reviewing, I think they're separable, independent changes, and you're unwilling to split them into separate issues, then I'll split them myself. But I'd rather you did it first!

          And, again, I'm not interested in looking at github anymore than I am in using Eclipse. If you want me to look at something, please attach a patch to an issue in Jira. Thanks!

          Show
          Doug Cutting added a comment - If, after reviewing, I think they're separable, independent changes, and you're unwilling to split them into separate issues, then I'll split them myself. But I'd rather you did it first! And, again, I'm not interested in looking at github anymore than I am in using Eclipse. If you want me to look at something, please attach a patch to an issue in Jira. Thanks!
          Hide
          Doug Cutting added a comment -

          Some comments on the patch:

          • You added a "never returns null" to Schema#getName(). This is not relevant to this issue, and it's not true. The record schema used to represent a message's parameter list is anonymous.
          • You've also made style changes (e.g., adding braces) in code unrelated to this issue. Can you please remove these?
          • Do you intend the addition of new parameters to public Schema methods to be an incompatible change? I'm okay with this, but it must be called out, so that we're sure to list it as an incompatible change in the next release.
          • The refactoring of SpecificCompiler does not seem required by this issue. It does perhaps simplify TestSpecificCompiler, but that also could be written to instead operate on files. In either case, I would prefer this were addressed in a separate issue.
          Show
          Doug Cutting added a comment - Some comments on the patch: You added a "never returns null" to Schema#getName(). This is not relevant to this issue, and it's not true. The record schema used to represent a message's parameter list is anonymous. You've also made style changes (e.g., adding braces) in code unrelated to this issue. Can you please remove these? Do you intend the addition of new parameters to public Schema methods to be an incompatible change? I'm okay with this, but it must be called out, so that we're sure to list it as an incompatible change in the next release. The refactoring of SpecificCompiler does not seem required by this issue. It does perhaps simplify TestSpecificCompiler, but that also could be written to instead operate on files. In either case, I would prefer this were addressed in a separate issue.
          Hide
          Philip Zeyliger added a comment -

          Thanks for the feedback. As you've no doubt noticed, I've started a handful of JIRAs for the stuff that you feel is unrelated. Once those settle down, I'll ressurect this one.

          You added a "never returns null" to Schema#getName(). This is not relevant to this issue, and it's not true. The record schema used to represent a message's parameter list is anonymous.

          Ah, I see. Schema.createRecord() is called by Protocol.parse().

          Show
          Philip Zeyliger added a comment - Thanks for the feedback. As you've no doubt noticed, I've started a handful of JIRAs for the stuff that you feel is unrelated. Once those settle down, I'll ressurect this one. You added a "never returns null" to Schema#getName(). This is not relevant to this issue, and it's not true. The record schema used to represent a message's parameter list is anonymous. Ah, I see. Schema.createRecord() is called by Protocol.parse().
          Hide
          Philip Zeyliger added a comment -

          AVRO-165, AVRO-166, and AVRO-167 were pulled out into their own issues from the previous patches here. The patch attached does what this ticket originally intended — adds the notion of "doc" to schemas.

          Show
          Philip Zeyliger added a comment - AVRO-165 , AVRO-166 , and AVRO-167 were pulled out into their own issues from the previous patches here. The patch attached does what this ticket originally intended — adds the notion of "doc" to schemas.
          Philip Zeyliger made changes -
          Attachment AVRO-152.patch.txt [ 12423418 ]
          Philip Zeyliger made changes -
          Release Note Signature of Schema constructors has been changed to include a "doc" parameter.
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hadoop Flags [Incompatible change]
          Hide
          Doug Cutting added a comment -

          We should also add doc to protocols and their messages. Note that this will require that the specific compiler generate a per-interface constant with the protocol definition, which is probably a good idea anyway. (Currently we use reflection to re-construct specific protocols, which is fragile.)

          Show
          Doug Cutting added a comment - We should also add doc to protocols and their messages. Note that this will require that the specific compiler generate a per-interface constant with the protocol definition, which is probably a good idea anyway. (Currently we use reflection to re-construct specific protocols, which is fragile.)
          Philip Zeyliger made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Philip Zeyliger added a comment -

          Finally got a chance to re-visit this. Attaching a new patch; it applies against trunk.

          I added "doc" messages to protocols.

          Note that this will require that the specific compiler generate a per-interface constant with the protocol definition, which is probably a good idea anyway. (Currently we use reflection to re-construct specific protocols, which is fragile.)

          I'm not quite sure what you mean here. Docs aren't being generated when a schema is retrieved via reflection.

          I haven't added per-parameter docs (akin to "@param" in javadoc) yet. That's a natural next thing to do, though. I'd like to get this in with the current functionality, though.

          – Philip

          Show
          Philip Zeyliger added a comment - Finally got a chance to re-visit this. Attaching a new patch; it applies against trunk. I added "doc" messages to protocols. Note that this will require that the specific compiler generate a per-interface constant with the protocol definition, which is probably a good idea anyway. (Currently we use reflection to re-construct specific protocols, which is fragile.) I'm not quite sure what you mean here. Docs aren't being generated when a schema is retrieved via reflection. I haven't added per-parameter docs (akin to "@param" in javadoc) yet. That's a natural next thing to do, though. I'd like to get this in with the current functionality, though. – Philip
          Philip Zeyliger made changes -
          Attachment AVRO-152.patch.txt [ 12429088 ]
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Doug Cutting added a comment -

          This looks good. A few minor issues:

          • the patch is stale, since I took so long to review it. Sorry!
          • why all the SpecificCompiler#doc() methods? Why not just call, e.g., doc(out, indent, schema.getDoc())?
          • shouldn't we now also extend avrogen to copy comments through?
          Show
          Doug Cutting added a comment - This looks good. A few minor issues: the patch is stale, since I took so long to review it. Sorry! why all the SpecificCompiler#doc() methods? Why not just call, e.g., doc(out, indent, schema.getDoc())? shouldn't we now also extend avrogen to copy comments through?
          Doug Cutting made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Hide
          Philip Zeyliger added a comment -

          the patch is stale, since I took so long to review it. Sorry!

          No worries. I've updated the patch.

          It's always strangely pleasant when patches go stale because there are new features to adapt to.

          why all the SpecificCompiler#doc() methods? Why not just call, e.g., doc(out, indent, schema.getDoc())?

          It was a good idea when there were only 2 types, and it felt cleaner. Fixed it to just use String and one doc() method. Avoided the temptation to introduce a "hasDoc" interface.

          shouldn't we now also extend avrogen to copy comments through?

          I've filed AVRO-296. It needs to be done, but it's not 100% trivial (because javacc strips out comments, and that needs to get undone), so I'd rather take a pass at it separately.

          Show
          Philip Zeyliger added a comment - the patch is stale, since I took so long to review it. Sorry! No worries. I've updated the patch. It's always strangely pleasant when patches go stale because there are new features to adapt to. why all the SpecificCompiler#doc() methods? Why not just call, e.g., doc(out, indent, schema.getDoc())? It was a good idea when there were only 2 types, and it felt cleaner. Fixed it to just use String and one doc() method. Avoided the temptation to introduce a "hasDoc" interface. shouldn't we now also extend avrogen to copy comments through? I've filed AVRO-296 . It needs to be done, but it's not 100% trivial (because javacc strips out comments, and that needs to get undone), so I'd rather take a pass at it separately.
          Philip Zeyliger made changes -
          Attachment AVRO-152.patch.txt [ 12429707 ]
          Philip Zeyliger made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Doug Cutting added a comment -

          I just committed this. Thanks, Philip!

          Show
          Doug Cutting added a comment - I just committed this. Thanks, Philip!
          Doug Cutting made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Incompatible change] [Incompatible change, Reviewed]
          Fix Version/s 1.3.0 [ 12314318 ]
          Resolution Fixed [ 1 ]
          Jeff Hammerbacher made changes -
          Link This issue relates to AVRO-300 [ AVRO-300 ]
          Doug Cutting made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Doug Cutting made changes -
          Component/s java [ 12312780 ]
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Patch Available Patch Available Open Open
          71d 12h 54m 2 Doug Cutting 07/Jan/10 20:14
          Open Open Patch Available Patch Available
          12d 10h 46m 3 Philip Zeyliger 08/Jan/10 02:40
          Patch Available Patch Available Resolved Resolved
          17h 9m 1 Doug Cutting 08/Jan/10 19:50
          Resolved Resolved Closed Closed
          51d 21h 18m 1 Doug Cutting 01/Mar/10 17:09

            People

            • Assignee:
              Philip Zeyliger
              Reporter:
              Philip Zeyliger
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development