Attached is a standalone test program, b428.java, for experimenting with the bug, and a patch proposal, derby-428.diff.
The patch contains a server-side change, a client-side change, and a regression test.
The server-side change is to call ensureLength() in DDMWriter.startDDM(). The DDMWriter working buffer is designed to grow dynamically to accommodate the data being written. This growth relies on a coding rule: every DDMWriter internal routine must call ensureLength() to communicate its buffer size requirements before writing bytes into the buffer. startDDM() was missing the call to ensureLength(). It was just luck that this hadn't caused any problems in the past. This particular bug exposed the problem by causing the server to write a tremendous number of very small DDM records in a single correlated chain, so that eventually (around batch element 9000) startDDM() tried to write past the end of the buffer without calling ensureLength() first. Simple change, even if my explanation is not so clear.
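To illustrate the coding rule, here is a minimal sketch of a growable buffer in which each writing routine reserves space before it writes. This is not the actual DDMWriter code: the class name GrowableBuffer, the method startRecord(), the doubling growth policy, and the 6-byte header layout are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical stand-in for a DDMWriter-style working buffer (not Derby's code).
class GrowableBuffer {
    private byte[] bytes;
    private int offset = 0;

    GrowableBuffer(int initialCapacity) {
        bytes = new byte[initialCapacity];
    }

    // Grow the buffer if fewer than `length` bytes remain past the current offset.
    // The doubling policy is an assumption chosen for the sketch.
    void ensureLength(int length) {
        int required = offset + length;
        if (required > bytes.length) {
            bytes = Arrays.copyOf(bytes, Math.max(bytes.length * 2, required));
        }
    }

    // Writes an illustrative 6-byte record header. Reserving the space first
    // is the step that the rule requires of every writing routine.
    void startRecord(int correlationId) {
        ensureLength(6);
        bytes[offset++] = 0;                    // length placeholder, high byte
        bytes[offset++] = 0;                    // length placeholder, low byte
        bytes[offset++] = (byte) 0xD0;          // marker byte (illustrative)
        bytes[offset++] = 0x01;                 // format byte (illustrative)
        bytes[offset++] = (byte) ((correlationId >>> 8) & 0xFF);
        bytes[offset++] = (byte) (correlationId & 0xFF);
    }

    int size() { return offset; }

    int capacity() { return bytes.length; }
}
```

With the ensureLength(6) line removed, writing enough small records into a small initial buffer would eventually fail with an ArrayIndexOutOfBoundsException, which is the same kind of failure described above.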
The client-side change is needed because DRDA imposes a hard limit of 65535 elements in a single correlated request: the correlation identifier is a two-byte unsigned integer. Without this change, the correlation identifier wraps around when we go to write the 65536th element in the batch, and we start breaking DRDA protocol rules, since DRDA requires that the correlation IDs in a single request always be increasing. The change in this patch proposal causes the client to throw an exception if it is asked to execute a batch containing more than 65534 elements. The reason for the number 65534, rather than 65535, is that the value 0xFFFF seems to be reserved for some special purpose.
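The wraparound and the guard against it can be sketched as follows. This is only an illustration of the arithmetic, not the code in the patch: the class BatchLimit, its method names, and the use of IllegalArgumentException are assumptions made for the example.

```java
// Hypothetical helper illustrating the two-byte correlation ID limit.
class BatchLimit {
    // Largest batch size the client accepts under this proposal.
    static final int MAX_BATCH_ELEMENTS = 65534;

    // A two-byte unsigned field keeps only the low 16 bits of the value,
    // so 65536 is written as 0 and the IDs stop increasing.
    static int toTwoByteUnsigned(int correlationId) {
        return correlationId & 0xFFFF;
    }

    // Reject oversized batches up front instead of letting the ID wrap.
    // The exception type here is an assumption made for the sketch.
    static void checkBatchSize(int batchSize) {
        if (batchSize > MAX_BATCH_ELEMENTS) {
            throw new IllegalArgumentException(
                "Batch of " + batchSize + " elements exceeds the limit of "
                + MAX_BATCH_ELEMENTS);
        }
    }
}
```

For example, toTwoByteUnsigned(65535) is 65535 but toTwoByteUnsigned(65536) is 0, while checkBatchSize(65534) returns normally and checkBatchSize(65535) throws.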
Experimenting with the JCC driver, I discovered that it seems to reserve not just 0xFFFF, but also 0xFFFE and 0xFFFD as special values; the largest number of elements that I could successfully execute in a single batch with the JCC driver is 65532. I don't know what is going on with those special values, unfortunately.
The regression test verifies that we can successfully execute a batch containing 65532 elements with both the Network Client and JCC drivers. The test also verifies that, if we are using the Network Client, then we get the expected exception if we try to execute a batch with more than 65534 elements.
Comments, suggestions, and feedback are welcome!