While investigating performance with qpid::messaging over AMQP 1.0, I saw much lower throughput and high CPU usage, particularly as message size grew. From callgrind it looked like the time was going into shuffling up the buffer in pn_dispatcher_output. Because of the threading in qpid::messaging, the application could generate too much output through the top half of the engine API before the IO for the bottom half had been done. Fixing that in qpid::messaging improved performance.
Perhaps there is something Proton could do to make users more aware of this, e.g. a log message if the buffer exceeds a certain size, or just documentation.
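To make the suggestion concrete, here is a rough sketch of the kind of check I have in mind. It is a standalone model, not Proton code: `OutputBuffer`, `sendThrottled`, the high-water mark, and the warning text are all hypothetical names invented for illustration, and `consume` only imitates the cost of shifting the unsent bytes to the front of the buffer. It shows a one-off warning when pending output exceeds a limit, and a producer that waits for the IO side to drain before generating more.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for a transport's pending-output buffer (not Proton's API).
class OutputBuffer {
public:
    explicit OutputBuffer(std::size_t highWaterMark) : highWaterMark_(highWaterMark) {}

    // Bytes generated by the "top half" that the IO layer has not yet written.
    std::size_t pending() const { return data_.size(); }

    bool aboveHighWaterMark() const { return data_.size() > highWaterMark_; }

    // Top half: append output. Warn once each time the buffer rises above the mark.
    void produce(const char* bytes, std::size_t n) {
        data_.insert(data_.end(), bytes, bytes + n);
        if (aboveHighWaterMark() && !warned_) {
            std::fprintf(stderr, "warning: %zu bytes of output pending (limit %zu)\n",
                         data_.size(), highWaterMark_);
            warned_ = true;
            ++warnings_;
        }
    }

    // Bottom half: n bytes were written; the remainder is shifted to the front,
    // so the cost of each call grows with the amount still pending.
    void consume(std::size_t n) {
        assert(n <= data_.size());
        data_.erase(data_.begin(), data_.begin() + static_cast<std::ptrdiff_t>(n));
        if (!aboveHighWaterMark()) warned_ = false;
    }

    std::size_t warnings() const { return warnings_; }

private:
    std::vector<char> data_;
    std::size_t highWaterMark_;
    std::size_t warnings_ = 0;
    bool warned_ = false;
};

// Producer that holds off until there is room below the mark. The inner loop
// stands in for "wait until the IO thread has written some output".
// Returns the largest amount of pending output observed.
std::size_t sendThrottled(OutputBuffer& out, std::size_t messages,
                          std::size_t messageSize, std::size_t ioChunk) {
    std::vector<char> payload(messageSize, 'x');
    std::size_t maxPending = 0;
    for (std::size_t i = 0; i < messages; ++i) {
        while (out.pending() > 0 && out.pending() + messageSize > 4096) {
            out.consume(std::min(ioChunk, out.pending()));
        }
        out.produce(payload.data(), payload.size());
        maxPending = std::max(maxPending, out.pending());
    }
    return maxPending;
}
```

In this model the room check in `sendThrottled` uses a fixed 4096-byte limit, so it should be paired with an `OutputBuffer` constructed with the same mark. An unthrottled producer writing 100 messages of 1024 bytes leaves all of it pending and triggers the warning, while the throttled one stays at or below the mark.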