Avro / AVRO-778

C: avrocc schema-specific compiler

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: c
    • Labels:
      None

      Description

      I've started work on a pure-C compiler for generating C types that are specific to an Avro schema. This resurrects the avro_schema_specific function that's been lying around in the header file for a while. I'm focusing only on the data serialization parts of the spec; at least for now, I'm not including any of the RPC stuff.

      One of my main goals with this is for the compiler to be in pure C; I don't want to have to install the Java language bindings in order to run the schema compiler. This means that I can't use the Velocity engine to produce the output .c and .h files. Instead of trying to find a good C templating engine, I'm just using the trusty C preprocessor. It's not the prettiest thing in the world, but it's perfectly capable.
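      To illustrate the general idea of using the C preprocessor as a template engine, here is a minimal sketch based on the well-known "X-macro" idiom. It is not taken from the avrocc patches: the record name `point`, the `POINT_FIELDS` list, and the generated function names are all hypothetical, and the actual compiler may structure its templates quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field list for a record named "point".
 * Each X(ctype, name) entry describes one record field. */
#define POINT_FIELDS(X)   \
    X(int32_t, x)         \
    X(int32_t, y)         \
    X(int64_t, timestamp)

/* "Template" 1: expand the field list into struct members. */
#define DECLARE_FIELD(ctype, name) ctype name;
typedef struct point {
    POINT_FIELDS(DECLARE_FIELD)
} point_t;

/* "Template" 2: expand the same list into an initializer body. */
#define ZERO_FIELD(ctype, name) self->name = 0;
void point_init(point_t *self)
{
    assert(self != NULL);
    POINT_FIELDS(ZERO_FIELD)
}

/* "Template" 3: expand the list into a field-by-field equality test.
 * The expansion reads: 1 && (a->x == b->x) && (a->y == b->y) && ... */
#define FIELD_EQ(ctype, name) && (a->name == b->name)
int point_equals(const point_t *a, const point_t *b)
{
    return 1 POINT_FIELDS(FIELD_EQ);
}
```

      The field list is written once and reused by every "template" macro, so adding a field updates the struct, the initializer, and the equality function together.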

      This issue will track my progress on this; I'll upload patches here, as well as maintain a branch at github.

      1. 0019-Resolver-class-for-schema-specific-writer-unions.patch
        32 kB
        Douglas Creager
      2. 0018-Resolver-class-for-schema-specific-reader-unions.patch
        40 kB
        Douglas Creager
      3. 0017-Schema-specific-schemas.patch
        13 kB
        Douglas Creager
      4. 0016-Schema-specific-resolvers-for-non-union-types.patch
        35 kB
        Douglas Creager
      5. 0015-Equality-functions-for-schema-specific-types.patch
        33 kB
        Douglas Creager
      6. 0014-Raw-resolver-classes.patch
        16 kB
        Douglas Creager
      7. 0013-Raw-value-comparison-operations.patch
        25 kB
        Douglas Creager
      8. 0012-Error-message-API-is-now-public.patch
        12 kB
        Douglas Creager
      9. 0011-Schema-specific-array-and-map-accessors.patch
        37 kB
        Douglas Creager
      10. 0010-Public-memoization-API.patch
        25 kB
        Douglas Creager
      11. 0009-Child-consumers-in-base-consumer-class.patch
        21 kB
        Douglas Creager
      12. 0008-Growing-maps-and-arrays-without-adding-elements.patch
        2 kB
        Douglas Creager
      13. 0007-Clearing-arrays-and-maps.patch
        5 kB
        Douglas Creager
      14. 0006-Schema-specific-union-functions.patch
        15 kB
        Douglas Creager
      15. 0005-Schema-specific-structs.patch
        67 kB
        Douglas Creager
      16. 0004-Custom-allocation-API-is-now-public.patch
        11 kB
        Douglas Creager
      17. 0003-Resizable-string-buffer-class.patch
        10 kB
        Douglas Creager
      18. 0002-Resizable-map-class.patch
        8 kB
        Douglas Creager
      19. 0001-Resizable-array-class.patch
        12 kB
        Douglas Creager

        Activity

        Douglas Creager created issue -
        Douglas Creager added a comment -

        Incidentally, I'll need to check AVRO-777 to see if there's any overlap, and how hard it would be to merge the two efforts.

        Douglas Creager added a comment -

        Current status:

        The avrocc compiler is built and installed into $PREFIX/bin, in both autotools and CMake builds. It produces a C type for each compound type found in an .avsc file. There are “lifecycle” methods for initializing and finalizing these schema-specific types, as well as accessor methods for arrays, maps, and unions. (Record fields turn into C struct fields, and don't need accessors.)
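        As a rough picture of what "lifecycle" and accessor functions for a schema-specific type could look like, here is a minimal sketch for a record with an int field `id` and an array-of-long field `scores`. Every name below (`sample_t`, `sample_init`, `sample_done`, `sample_scores_append`, and so on) is an assumption made up for illustration; the identifiers that avrocc actually generates are not described in this issue.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical generated storage for the array field "scores". */
typedef struct sample_scores {
    int64_t *items;
    size_t   count;
    size_t   allocated;
} sample_scores_t;

/* Hypothetical generated type for the record "sample". */
typedef struct sample {
    int32_t         id;      /* record field: a plain struct member */
    sample_scores_t scores;  /* array field: reached through accessors */
} sample_t;

/* Lifecycle: put the value into a known, empty state. */
void sample_init(sample_t *self)
{
    assert(self != NULL);
    self->id = 0;
    self->scores.items = NULL;
    self->scores.count = 0;
    self->scores.allocated = 0;
}

/* Lifecycle: release everything the value owns, then reset it. */
void sample_done(sample_t *self)
{
    assert(self != NULL);
    free(self->scores.items);
    sample_init(self);
}

/* Accessor: append one element, growing the buffer when it is full.
 * Returns 0 on success, -1 if allocation fails. */
int sample_scores_append(sample_t *self, int64_t value)
{
    sample_scores_t *arr = &self->scores;
    if (arr->count == arr->allocated) {
        size_t new_alloc = arr->allocated ? arr->allocated * 2 : 4;
        int64_t *grown = realloc(arr->items, new_alloc * sizeof *grown);
        if (grown == NULL) {
            return -1;
        }
        arr->items = grown;
        arr->allocated = new_alloc;
    }
    arr->items[arr->count++] = value;
    return 0;
}

/* Accessor: number of elements currently stored. */
size_t sample_scores_size(const sample_t *self)
{
    return self->scores.count;
}

/* Accessor: element at index; the caller must pass index < size. */
int64_t sample_scores_get(const sample_t *self, size_t index)
{
    assert(index < self->scores.count);
    return self->scores.items[index];
}
```

        In this sketch the record field `id` is read and written directly (`s.id = 7;`), while the array is only touched through its accessor functions, mirroring the split described above.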

        Next main task is to implement avro_resolver_t equivalents for each schema, which will allow us to parse Avro data directly into the schema-specific C types.

        Douglas Creager made changes -
        Field Original Value New Value
        Attachment 0001-Resizable-array-class.patch [ 12473139 ]
        Attachment 0002-Resizable-map-class.patch [ 12473140 ]
        Attachment 0003-Resizable-string-buffer-class.patch [ 12473141 ]
        Attachment 0004-Custom-allocation-API-is-now-public.patch [ 12473142 ]
        Attachment 0005-Schema-specific-structs.patch [ 12473143 ]
        Attachment 0006-Schema-specific-union-functions.patch [ 12473144 ]
        Attachment 0007-Clearing-arrays-and-maps.patch [ 12473145 ]
        Attachment 0008-Growing-maps-and-arrays-without-adding-elements.patch [ 12473146 ]
        Attachment 0009-Child-consumers-in-base-consumer-class.patch [ 12473147 ]
        Attachment 0010-Public-memoization-API.patch [ 12473148 ]
        Attachment 0011-Schema-specific-array-and-map-accessors.patch [ 12473149 ]
        Douglas Creager made changes -
        Component/s c [ 12312857 ]
        Douglas Creager added a comment -

        Adding latest patches from github branch.

        Current status:

        This is basically done, though I need to put together some better documentation about which types and functions are generated for each schema. There is now a schema-specific resolver implementation for each compound schema in the AVSC file. The resolvers correctly handle reader unions and writer unions. Features still to implement: primitive value promotion (int->long etc.), and a function for outputting an Avro binary encoding.
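        For context on the two remaining features, here is a small sketch of what they involve at the lowest level. The Avro specification writes int and long values using a variable-length zig-zag encoding, and promoting an int to a long preserves the value. The function names `promote_int_to_long` and `encode_long` are hypothetical and are not part of the patches attached here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Promotion: an Avro int read under a long reader schema keeps its
 * value and is simply widened. */
int64_t promote_int_to_long(int32_t v)
{
    return (int64_t) v;
}

/* Writes v using Avro's zig-zag variable-length encoding.
 * buf must have room for at least 10 bytes (the worst case for 64 bits).
 * Returns the number of bytes written. */
size_t encode_long(int64_t v, uint8_t *buf)
{
    assert(buf != NULL);

    /* Zig-zag step: interleave negative and non-negative values
     * (0, -1, 1, -2, ... map to 0, 1, 2, 3, ...) so that small
     * magnitudes produce short encodings. */
    uint64_t n = ((uint64_t) v << 1) ^ (v < 0 ? UINT64_MAX : 0);

    /* Varint step: emit 7 bits per byte, least significant group first;
     * the high bit of each byte marks that more bytes follow. */
    size_t len = 0;
    while (n >= 0x80) {
        buf[len++] = (uint8_t) ((n & 0x7f) | 0x80);
        n >>= 7;
    }
    buf[len++] = (uint8_t) n;
    return len;
}
```

        A writer for a whole schema-specific record would presumably call a routine like this once per int or long field, in schema order.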


          People

          • Assignee:
            Douglas Creager
            Reporter:
            Douglas Creager
          • Votes:
            1
            Watchers:
            0
