Avro
  1. Avro
  2. AVRO-151

Validating Avro schema parser in C

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: c
    • Labels:
      None

      Description

      This Jira is for tracking commit r825593 "Added a validating Avro schema parser in C"

      This commit adds a schema parser which validates the schema while generating an intermediate tree of value objects for printing, reading, skipping and writing avro values. You'll notice that the C code is written in an object-oriented fashion with clear interfaces, data encapsulation, etc. My primary focus at this time is to make the core source as easy to understand and work on as possible to make it easier to add new developers moving forward. I plan to add a developers guide to src/c soon to help toward that end as well. I've separated the public API into avro.h and the private API into avro_private.h. You'll notice that currently there is no public API. Once the internals are built correctly (with better error reporting), sketching out the public API will be easier and can focus on usability and flexibility for consumers of Avro in C.

      I'm very near to having Avro file object containers working and you can see the code for that in the source (avro_file_container.c). It correctly parses the interop.avsc and reads the non-recursive elements. My only blocker for file containers now is support for reading recursive schemas (I'm currently using a decorator when I should have use a factory) and the fact that a few complex types don't have reading/writing support. I expect file object container support to be finished soon.

        Activity

        Hide
        Matt Massie added a comment -

        My apologies for having two commit to push this code. The second commit was required because I failed to update CHANGES.txt.

        ------------------------------------------------------------------------
        r825702 | massie | 2009-10-15 16:41:26 -0700 (Thu, 15 Oct 2009) | 2 lines

        AVRO-151. Validating Avro schema parser for C (code committed r825593)

        ------------------------------------------------------------------------
        r825593 | massie | 2009-10-15 11:00:21 -0700 (Thu, 15 Oct 2009) | 3 lines

        Added a validating Avro schema parser in C

        Show
        Matt Massie added a comment - My apologies for having two commit to push this code. The second commit was required because I failed to update CHANGES.txt. ------------------------------------------------------------------------ r825702 | massie | 2009-10-15 16:41:26 -0700 (Thu, 15 Oct 2009) | 2 lines AVRO-151 . Validating Avro schema parser for C (code committed r825593) ------------------------------------------------------------------------ r825593 | massie | 2009-10-15 11:00:21 -0700 (Thu, 15 Oct 2009) | 3 lines Added a validating Avro schema parser in C
        Hide
        Thiruvalluvan M. G. added a comment -

        It appears that you forgot to check-in src/c/configure. Or am I missing something?

        Show
        Thiruvalluvan M. G. added a comment - It appears that you forgot to check-in src/c/configure. Or am I missing something?
        Hide
        Matt Massie added a comment -

        $ ant compile-c

        takes care of generating the configure script. You can also just run
        `autoreconf -f -i` in the src/c directory to create it.

        -Matt

        On Thu, Oct 15, 2009 at 8:12 PM, Thiruvalluvan M. G. (JIRA) <jira@apache.org

        Show
        Matt Massie added a comment - $ ant compile-c takes care of generating the configure script. You can also just run `autoreconf -f -i` in the src/c directory to create it. -Matt On Thu, Oct 15, 2009 at 8:12 PM, Thiruvalluvan M. G. (JIRA) <jira@apache.org
        Hide
        Patrick Hunt added a comment -

        Matt, Re Thiru's comment, also a lesson learned in zookeeper, we found that not distributing configure with the release
        was a problem in some cases. However checking into svn is not optimal either. An alternative that seems to be
        working well is to; 1) don't checking configure to svn, just the configure.in and other auto* src files, 2) generate
        configure as part of the package target in build.xml and include the generated files in the release.

        When researching for ZK I found the following to give some very good insight, see the section "great idea in theory":
        http://www.redhat.com/magazine/012oct05/features/autotools/
        "Unfortunately, due to the fact that the original software developer probably uses older versions of the autotools than the RPM packager, this does not always do the right thing."

        Show
        Patrick Hunt added a comment - Matt, Re Thiru's comment, also a lesson learned in zookeeper, we found that not distributing configure with the release was a problem in some cases. However checking into svn is not optimal either. An alternative that seems to be working well is to; 1) don't checking configure to svn, just the configure.in and other auto* src files, 2) generate configure as part of the package target in build.xml and include the generated files in the release. When researching for ZK I found the following to give some very good insight, see the section "great idea in theory": http://www.redhat.com/magazine/012oct05/features/autotools/ "Unfortunately, due to the fact that the original software developer probably uses older versions of the autotools than the RPM packager, this does not always do the right thing."
        Hide
        Matt Massie added a comment -

        Patrick-

        My current approach is to checkin the configure.in and Makefile.am files. In build.xml, there is an uptodate target that checks for a configure script and if it's not there runs `autoreconf -f -i`. In the near future, I plan to use the 'distdir' target in autotools to create a C source directory for distribution. This will ensure that consumers of Avro C code don't need to have autotools installed to build it.

        For pure autotools projects, a simple `make distcheck` runs unit tests for the package and, if they pass, creates a tarball ready for distribution. I'll try to mimic that behavior in Avro as much as possible although bolting an autotools project to an ant project isn't optimal.

        You can expect a new 'package-c' target in build.xml from me soon. Thanks for the feedback and link.

        -Matt

        Show
        Matt Massie added a comment - Patrick- My current approach is to checkin the configure.in and Makefile.am files. In build.xml, there is an uptodate target that checks for a configure script and if it's not there runs `autoreconf -f -i`. In the near future, I plan to use the 'distdir' target in autotools to create a C source directory for distribution. This will ensure that consumers of Avro C code don't need to have autotools installed to build it. For pure autotools projects, a simple `make distcheck` runs unit tests for the package and, if they pass, creates a tarball ready for distribution. I'll try to mimic that behavior in Avro as much as possible although bolting an autotools project to an ant project isn't optimal. You can expect a new 'package-c' target in build.xml from me soon. Thanks for the feedback and link. -Matt
        Hide
        Patrick Hunt added a comment -

        Sounds good, I think your point "This will ensure that consumers of Avro C code don't need to have autotools installed to build it." is
        part of the issue we saw in zk and the key point here - the std use case for end users is to have a configure such that they can
        "./configure; make" and off to the races.

        Show
        Patrick Hunt added a comment - Sounds good, I think your point "This will ensure that consumers of Avro C code don't need to have autotools installed to build it." is part of the issue we saw in zk and the key point here - the std use case for end users is to have a configure such that they can "./configure; make" and off to the races.
        Hide
        Doug Cutting added a comment -

        For future reference, the primary commit for this issue was revision 825593.

        http://svn.apache.org/viewvc?rev=825593&view=rev

        Matt, in the future, remember to not only update CHANGES.txt, but also to name the Jira issue in each commit message. That creates links from Jira to the commit, which can be very convenient.

        The C code is still in an early state, and does not yet require reviews before each commit. But it's still best practice to file an issue, post the patch, and then wait a day or so for feedback before committing. As the C code becomes complete and others start to use and contribute to it, this will evolve to requiring another committer to +1 each patch before it commits. We're willing to streamline that now, but we still want a Jira issue per patch and an opportunity for folks to remark on things before they're committed. Thanks!

        Show
        Doug Cutting added a comment - For future reference, the primary commit for this issue was revision 825593. http://svn.apache.org/viewvc?rev=825593&view=rev Matt, in the future, remember to not only update CHANGES.txt, but also to name the Jira issue in each commit message. That creates links from Jira to the commit, which can be very convenient. The C code is still in an early state, and does not yet require reviews before each commit. But it's still best practice to file an issue, post the patch, and then wait a day or so for feedback before committing. As the C code becomes complete and others start to use and contribute to it, this will evolve to requiring another committer to +1 each patch before it commits. We're willing to streamline that now, but we still want a Jira issue per patch and an opportunity for folks to remark on things before they're committed. Thanks!

          People

          • Assignee:
            Matt Massie
            Reporter:
            Matt Massie
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development