Qpid Proton
  1. Qpid Proton
  2. PROTON-57

Proton porting problems between current codebase and Visual Studio 2010 toolset

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4
    • Component/s: proton-c
    • Labels:
    • Environment:
      Windows using Visual Studio 2010

      Description

      This thread will be used to discuss the porting problems encountered using Visual Studio 2010

      Here’s the first one to discuss:

      1. Visual Studio doesn’t support variable length arrays.
      a. Currently using malloc()/realloc() in my port just to get it to compile and be able to report memory allocation errors. This is not what I want to submit to the proton group for memory allocation.

      b. Cliff had a good method that included setting up macros and replace the VLAs with alloca() in the Windows version, but it could still cause problems when the stack overflowed. VLAs can also run out of stack space.

      c. _malloca() should be a better way than _alloca() on Visual Studio. Any messages under 1K would be allocated out of the stack. Over 1K will use heap. If the average messages are under 1K, this may be the most efficient way in Visual Studio. _malloca() also has new security enhancements.
      1. Using _malloca() in the Windows version and VLA in Linux would require two macros. The major difference for the Linux version would be to use the new macro for VLA and to include the new free macro even though there is nothing to free using VLA. In Visual Studio, _freea(buf) will not free anything if it is allocating off the stack.
      Linux can continue to use VLAs.

      #ifdef C99
      #define PN_VLA(TYPE, buf, size) TYPE buf[size]
      #define PN_VLA_FREE
      #else
      #define PN_VLA(TYPE, buf, size) TYPE buf = (TYPE) _malloca(size)
      #define PN_VLA_FREE(buf) _freea(buf)
      #endif

      d. If the average size messages to allocate out of the stack needs to be increased for performance reasons, we can set up a new memory model. The 1K is not adjustable for _malloca().

      We can set up new macros along the lines of Microsoft’s suggestion below.
      “I would suggest something similar to what ATL does. Use alloca for small (where you define what that is) sizes and use heap allocation for larger ones. You can wrap the logic inside of a class or macros. To work around the fact that alloca isn't cleaned up at block scope, rewrite the block into functions or use lambdas. (I think alloca works inside of lambdas.)”

      1. messenger.c.patch
        2 kB
        Mary hinton
      2. codec.c.patch
        4 kB
        Mary hinton

        Activity

        Hide
        Cliff Jansen added a comment -

        fixed in PROTON-159

        Show
        Cliff Jansen added a comment - fixed in PROTON-159
        Hide
        Mary hinton added a comment -

        Added patches for removing VLAs from code.

        Show
        Mary hinton added a comment - Added patches for removing VLAs from code.
        Hide
        Mary hinton added a comment -

        For the ones you want a fixed maximum length, is there a specific size for printing the atoms and data that you want? The same with the maximum length for the addresses, what size would you like to specify?

        “pn_data_encode/decode(...) For these we should add a dedicated array to pn_data_t and grow the array as needed.”

        pn_data_encode() used a VLA for allocating atoms on the stack. Did you want me to put in a pointer to atoms in pn_data_t and then allocate it out of heap and if needed realloc() and free it in pn_data_free()?

        Show
        Mary hinton added a comment - For the ones you want a fixed maximum length, is there a specific size for printing the atoms and data that you want? The same with the maximum length for the addresses, what size would you like to specify? “pn_data_encode/decode(...) For these we should add a dedicated array to pn_data_t and grow the array as needed.” pn_data_encode() used a VLA for allocating atoms on the stack. Did you want me to put in a pointer to atoms in pn_data_t and then allocate it out of heap and if needed realloc() and free it in pn_data_free()?
        Hide
        Rafael H. Schloming added a comment -

        Thanks for generating the list, I think this is quite manageable. Here's my take on what we should do in each case:

        proton.c:
        This is a very outdated example that should probably be moved and/or renamed and/or updated/replaced or possibly just deleted. The use of VLAs here is just for creating message data, if this is done through the message API then we should be able to eliminate the VLAs whilst simultaneously making the example a tad more useful.

        codec.c:
        pn_print_atoms(...) This is only used for debug printing, so I think it's entirely unnecessary to allow for an arbitrarily large array. We can just use a fixed maximum, so long as we handle the overflow case so that the string is a) properly terminated, and b) indicates to the user that the debug output has been truncated in some way.
        pn_data_encode/decode(...) For these we should add a dedicated array to pn_data_t and grow the array as needed.

        messenger.c:
        All of these are simply creating scratch buffers for string parsing and manipulation of addresses. I believe it would be reasonable to either A) impose a fixed maximum length on addresses, or B) add a dedicated scratch buffer to pn_messenger_t to hold the address for parsing.

        sasl.c:
        I'll volunteer to eliminate this one. (It's easier to do than to describe.)

        pn_fprint_data(...):
        Again this is only for debug output, so having a fixed maximum should be fine as long as we make it obvious that the displayed output is truncated.

        Show
        Rafael H. Schloming added a comment - Thanks for generating the list, I think this is quite manageable. Here's my take on what we should do in each case: proton.c: This is a very outdated example that should probably be moved and/or renamed and/or updated/replaced or possibly just deleted. The use of VLAs here is just for creating message data, if this is done through the message API then we should be able to eliminate the VLAs whilst simultaneously making the example a tad more useful. codec.c: pn_print_atoms(...) This is only used for debug printing, so I think it's entirely unnecessary to allow for an arbitrarily large array. We can just use a fixed maximum, so long as we handle the overflow case so that the string is a) properly terminated, and b) indicates to the user that the debug output has been truncated in some way. pn_data_encode/decode(...) For these we should add a dedicated array to pn_data_t and grow the array as needed. messenger.c: All of these are simply creating scratch buffers for string parsing and manipulation of addresses. I believe it would be reasonable to either A) impose a fixed maximum length on addresses, or B) add a dedicated scratch buffer to pn_messenger_t to hold the address for parsing. sasl.c: I'll volunteer to eliminate this one. (It's easier to do than to describe.) pn_fprint_data(...): Again this is only for debug output, so having a fixed maximum should be fine as long as we make it obvious that the displayed output is truncated.
        Hide
        Mary hinton added a comment -

        proton.c
        server_callback(pn_connector_t *ctor) – char iresp[n];
        – char data[ctx->size + 16];
        client_callback(pn_connector_t *ctor) – char msg[ctx->size]
        – char data[ctx->size + 16];

        codec.c
        pn_print_atoms(const pn_atoms_t * atoms) – char buf[size]
        pn_data_encode(pn_data_t *data, char *bytes, size_t size) – pn_atom_t atoms[data->size + data->extras];
        pn_data_decode(pn_data_t *data, char *bytes, size_t size) – pn_atom_t atoms[asize];

        messenger.c
        pn_messenger_resolve(pn_messenger_t *messenger, char *address, char **name) – char domain[strlen(address) + 1];
        pn_messenger_link(pn_messenger_t *messenger, const char *address, bool sender) – char copy[(address ? strlen(address) : 0) + 1];
        pn_messenger_subscribe(pn_messenger_t *messenger, const char *source) – char copy[strlen(source) + 1];
        outward_munge(pn_messenger_t *mng, pn_message_t *msg) – char buf[len + strlen(mng->name) + 4];
        – char buf[strlen(mng->name) + 4];
        pn_messenger_put(pn_messenger_t *messenger, pn_message_t *msg) – char encoded[size];

        sasl.c
        pn_sasl_plain(pn_sasl_t *sasl, const char *username, const char *password) – char iresp[size];
        pn_fprint_data(FILE *stream, const char *bytes, size_t size) – char buf[capacity];

        This list only includes the VLAs and doesn't include constant size arrays.

        Show
        Mary hinton added a comment - proton.c server_callback(pn_connector_t *ctor) – char iresp [n] ; – char data [ctx->size + 16] ; client_callback(pn_connector_t *ctor) – char msg [ctx->size] – char data [ctx->size + 16] ; codec.c pn_print_atoms(const pn_atoms_t * atoms) – char buf [size] pn_data_encode(pn_data_t *data, char *bytes, size_t size) – pn_atom_t atoms [data->size + data->extras] ; pn_data_decode(pn_data_t *data, char *bytes, size_t size) – pn_atom_t atoms [asize] ; messenger.c pn_messenger_resolve(pn_messenger_t *messenger, char *address, char **name) – char domain [strlen(address) + 1] ; pn_messenger_link(pn_messenger_t *messenger, const char *address, bool sender) – char copy [(address ? strlen(address) : 0) + 1] ; pn_messenger_subscribe(pn_messenger_t *messenger, const char *source) – char copy [strlen(source) + 1] ; outward_munge(pn_messenger_t *mng, pn_message_t *msg) – char buf [len + strlen(mng->name) + 4] ; – char buf [strlen(mng->name) + 4] ; pn_messenger_put(pn_messenger_t *messenger, pn_message_t *msg) – char encoded [size] ; sasl.c pn_sasl_plain(pn_sasl_t *sasl, const char *username, const char *password) – char iresp [size] ; pn_fprint_data(FILE *stream, const char *bytes, size_t size) – char buf [capacity] ; This list only includes the VLAs and doesn't include constant size arrays.
        Hide
        Rafael H. Schloming added a comment -

        Hi Mary,

        Would it be possible for you to generate a listing of where the VLA issues are? If there aren't too many of them I'd like to just eliminate them on a case by case basis and avoid all stack allocation entirely.

        Show
        Rafael H. Schloming added a comment - Hi Mary, Would it be possible for you to generate a listing of where the VLA issues are? If there aren't too many of them I'd like to just eliminate them on a case by case basis and avoid all stack allocation entirely.
        Hide
        Mary hinton added a comment -

        I forgot to put the sizeof(TYPE) for Windows in the Macro for Windows port

        #ifdef C99
        #define PN_VLA(TYPE, buf, size) TYPE buf[size]
        #define PN_VLA_FREE
        #else
        #define PN_VLA(TYPE, buf, size) TYPE buf = (TYPE) _malloca((size) * sizeof(TYPE))
        #define PN_VLA_FREE(buf) _freea(buf)
        #endif

        Show
        Mary hinton added a comment - I forgot to put the sizeof(TYPE) for Windows in the Macro for Windows port #ifdef C99 #define PN_VLA(TYPE, buf, size) TYPE buf [size] #define PN_VLA_FREE #else #define PN_VLA(TYPE, buf, size) TYPE buf = (TYPE ) _malloca((size) * sizeof(TYPE)) #define PN_VLA_FREE(buf) _freea(buf) #endif
        Hide
        Mary hinton added a comment -

        I’m going to list the problems with the biggest hits on the code first.

        2. Initialization with braces is not supported by Visual Studio. I used ifdef(s) in the codebase to keep the current code working for Linux, but changed quite a few of these to get it to compile in Visual Studio.

        Eample1

        ssize_t pn_data_encode(pn_data_t *data, char *bytes, size_t size)
        {
        #ifndef _WINDOWS
        pn_atom_t atoms[data->size + data->extras];
        pn_atoms_t latoms =

        {.size=data->size + data->extras, .start=atoms}

        ;
        #else
        PN_VLA(pn_atom_t, atoms, data->size + data->extras);
        pn_atoms_t latoms;
        latoms.size = data->size + data->extras;
        latoms.start = atoms;
        #endif

        pn_data_as_atoms(data, &latoms);
        pn_bytes_t lbytes = pn_bytes(size, bytes);

        int err = pn_encode_atoms(&lbytes, &latoms);

        PN_VLA_FREE(atoms);

        if (err) return err;
        return lbytes.size;
        }

        --------------------
        Example 2

        #ifndef _WINDOWS
        return (pn_bytes_t)

        {size, start}

        ;
        #else
        pn_bytes_t pnBytes;
        pnBytes.size = size;
        pnBytes.start= start;
        return pnBytes;
        #endif

        --------------------
        Example 3

        pn_do_transfer()

        #ifndef _WINDOWS
        delivery = pn_delivery(link, pn_dtag(tag.start, tag.size));
        #else
        pn_delivery_tag_t delivt;
        delivt.bytes = tag.start;
        delivt.size = tag.size;
        delivery = pn_delivery(link, delivt);
        #endif

        Show
        Mary hinton added a comment - I’m going to list the problems with the biggest hits on the code first. 2. Initialization with braces is not supported by Visual Studio. I used ifdef(s) in the codebase to keep the current code working for Linux, but changed quite a few of these to get it to compile in Visual Studio. Eample1 ssize_t pn_data_encode(pn_data_t *data, char *bytes, size_t size) { #ifndef _WINDOWS pn_atom_t atoms [data->size + data->extras] ; pn_atoms_t latoms = {.size=data->size + data->extras, .start=atoms} ; #else PN_VLA(pn_atom_t, atoms, data->size + data->extras); pn_atoms_t latoms; latoms.size = data->size + data->extras; latoms.start = atoms; #endif pn_data_as_atoms(data, &latoms); pn_bytes_t lbytes = pn_bytes(size, bytes); int err = pn_encode_atoms(&lbytes, &latoms); PN_VLA_FREE(atoms); if (err) return err; return lbytes.size; } -------------------- Example 2 #ifndef _WINDOWS return (pn_bytes_t) {size, start} ; #else pn_bytes_t pnBytes; pnBytes.size = size; pnBytes.start= start; return pnBytes; #endif -------------------- Example 3 pn_do_transfer() #ifndef _WINDOWS delivery = pn_delivery(link, pn_dtag(tag.start, tag.size)); #else pn_delivery_tag_t delivt; delivt.bytes = tag.start; delivt.size = tag.size; delivery = pn_delivery(link, delivt); #endif
        Hide
        Rafael H. Schloming added a comment -

        So I think most of the uses of VLAs in the current codebase are due to expediency/laziness on my part and should be eliminated in favor of a dedicated buffer held by an appropriate object. I think I'm pretty comfortable overall saying that we should get to a place where there is no variable stack allocation in the code whatsoever (by alloca or any other means).

        On the general topic I think one of the key things we need to do is ensure that whatever patterns/practices are causing problems with C++ are addressed in a way that we don't keep reintroducing more issues. It's fine to take a snapshot and get it to work, but if upstream practices keep reintroducing more problems then there will be continual catchup. I think one of the important things here will be to have some kind of CI build so that those of us who aren't using visual studio as our primary dev environment get early feedback if we accidentally break something.

        Show
        Rafael H. Schloming added a comment - So I think most of the uses of VLAs in the current codebase are due to expediency/laziness on my part and should be eliminated in favor of a dedicated buffer held by an appropriate object. I think I'm pretty comfortable overall saying that we should get to a place where there is no variable stack allocation in the code whatsoever (by alloca or any other means). On the general topic I think one of the key things we need to do is ensure that whatever patterns/practices are causing problems with C++ are addressed in a way that we don't keep reintroducing more issues. It's fine to take a snapshot and get it to work, but if upstream practices keep reintroducing more problems then there will be continual catchup. I think one of the important things here will be to have some kind of CI build so that those of us who aren't using visual studio as our primary dev environment get early feedback if we accidentally break something.
        Hide
        Mary hinton added a comment -

        Thanks for the feedback,

        We could substitute the alloca() for memory allocation that is not in a loop as long as the buffers are reasonable sizes.

        Since alloca() acquires new stack space each time it's called without reusing space from previous calls, it's not a good idea to use it inside a loop.

        The problem with anything being allocated out of the stack is the size of the allocation. Stack space can still run out and alloca() will not warn us of a problem. Using a loop in Linux containing a VLA to reallocate stack space, you can still get segmentation failure if the allocation is large enough. Also if large messages have been allocated on the stack, the memory can be reused, but my understanding is that the stack doesn't shrink. The only way to recover stack memory is when the thread exits.

        I notice proton is initially set up with a stack space size of 10000000, so it may be expecting lots of large messages. I didn't know how many threads are being created to handle buffers allocated out of the stack and if the number of threads could be a problem for stack space usage.

        Here's a snippet of code using the macros proposed:

        Changed char data[ctx->size + 16];

        server_callback()

        { char tagstr[1024]; char msg[10*1024]; PN_VLA(char, data, ctx->size + 16); ... PN_VLA_FREE(data); }

        I saw various web sites arguing the pros and cons of VLAs, so maybe it is a good idea to discuss VLAs in their own Jira thread.

        Show
        Mary hinton added a comment - Thanks for the feedback, We could substitute the alloca() for memory allocation that is not in a loop as long as the buffers are reasonable sizes. Since alloca() acquires new stack space each time it's called without reusing space from previous calls, it's not a good idea to use it inside a loop. The problem with anything being allocated out of the stack is the size of the allocation. Stack space can still run out and alloca() will not warn us of a problem. Using a loop in Linux containing a VLA to reallocate stack space, you can still get segmentation failure if the allocation is large enough. Also if large messages have been allocated on the stack, the memory can be reused, but my understanding is that the stack doesn't shrink. The only way to recover stack memory is when the thread exits. I notice proton is initially set up with a stack space size of 10000000, so it may be expecting lots of large messages. I didn't know how many threads are being created to handle buffers allocated out of the stack and if the number of threads could be a problem for stack space usage. Here's a snippet of code using the macros proposed: Changed char data [ctx->size + 16] ; server_callback() { char tagstr[1024]; char msg[10*1024]; PN_VLA(char, data, ctx->size + 16); ... PN_VLA_FREE(data); } I saw various web sites arguing the pros and cons of VLAs, so maybe it is a good idea to discuss VLAs in their own Jira thread.
        Hide
        Cliff Jansen added a comment -

        Mary, to make your question more clear, could you please attach a small representative code snippet from proton, showing the existing code and what it would look like after your proposed changes? Then people can more easily visualize the impact your changes would have on the existing code.

        I detect two separate questions that you are mingling:

        • How best to mimic VLAs in proton?
        • Are VLAs inherently too dangerous for proton to use, at least sometimes?

        The former is the one I would like to focus on. Presumably, if there is a design flaw of using stack (instead of heap/malloc) for temporary storage, that affects the existing proton C99 code as much as a C++ implementation. I would prefer that gets a separate JIRA and doesn't cloud the compiler related issue.

        Assuming the use of the VLA is appropriate in a particular instance, I believe an alloca from within C++ is appropriate. This has the benefit of requiring no associated free(). It has the downside that it doesn't support a realloc (at least on all platforms) so code that reuses a VLA with a new size is problematic. Hopefully those cases are few and can be reworked.

        Show
        Cliff Jansen added a comment - Mary, to make your question more clear, could you please attach a small representative code snippet from proton, showing the existing code and what it would look like after your proposed changes? Then people can more easily visualize the impact your changes would have on the existing code. I detect two separate questions that you are mingling: How best to mimic VLAs in proton? Are VLAs inherently too dangerous for proton to use, at least sometimes? The former is the one I would like to focus on. Presumably, if there is a design flaw of using stack (instead of heap/malloc) for temporary storage, that affects the existing proton C99 code as much as a C++ implementation. I would prefer that gets a separate JIRA and doesn't cloud the compiler related issue. Assuming the use of the VLA is appropriate in a particular instance, I believe an alloca from within C++ is appropriate. This has the benefit of requiring no associated free(). It has the downside that it doesn't support a realloc (at least on all platforms) so code that reuses a VLA with a new size is problematic. Hopefully those cases are few and can be reworked.
        Hide
        Cliff Jansen added a comment -

        I regret the title of this JIRA, as it unintentionally implies a narrow scope and may not attract the full intended audience. Visual Studio on Windows is indeed the working platform but the questions are really about building proton using C++ compilers in general.

        The underlying question is: how hard would it be to make proton-c work in C++ as an additional target language? Proton is written in C99, which has a large, but not perfect overlap with C+. So perhaps ultimately, the question boils down to: is the intersection of C+ and C99 a rich enough subset of C for long term proton-c work?

        For some perspective on Mary's work, she has been investigating this in detail (using VS2010) and she has found that the overall impact looks promisingly manageable. But some intrusive changes are required. Variable length arrays are one such issue and show up frequently in the code. It would be nice to have some community feedback on this issue in advance of an actual patch.

        Show
        Cliff Jansen added a comment - I regret the title of this JIRA, as it unintentionally implies a narrow scope and may not attract the full intended audience. Visual Studio on Windows is indeed the working platform but the questions are really about building proton using C++ compilers in general. The underlying question is: how hard would it be to make proton-c work in C++ as an additional target language? Proton is written in C99, which has a large, but not perfect overlap with C+ . So perhaps ultimately, the question boils down to: is the intersection of C + and C99 a rich enough subset of C for long term proton-c work? For some perspective on Mary's work, she has been investigating this in detail (using VS2010) and she has found that the overall impact looks promisingly manageable. But some intrusive changes are required. Variable length arrays are one such issue and show up frequently in the code. It would be nice to have some community feedback on this issue in advance of an actual patch.

          People

          • Assignee:
            Cliff Jansen
            Reporter:
            Mary hinton
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development