Description
When reading N characters at a buffer boundary, incorrect bounds logic reads one byte past the end of the buffer, which leads to unpredictable parse errors.
To reproduce, create an XML file larger than 16 KB (16384 bytes) and parse it using guththila while running under valgrind (I'm using valgrind 3.5). Adjust the content around the 16384th byte until you see an invalid read error from valgrind. I did this by adding and removing some character content.
The 16 KB value derives from the guththila buffer size (GUTHTHILA_BUFFER_DEF_SIZE). Decreasing this value (to 512 or 1024) may result in regressions in the existing guththila and axis2c test suites. I'm guessing there's very limited testing of XML messages larger than 16 KB, which is why this bug has survived so long.
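For anyone trying to reproduce this, a small generator makes it easier to move content across the buffer boundary one byte at a time. The following is only a sketch: the function name write_padded_xml, the element name "root" and the output file name are made up for illustration, and it assumes the default 16384-byte buffer. It writes an element whose closing tag starts at a chosen byte offset.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of guththila.
 * Writes "<root>xxx...x</root>\n" so that "</root>" begins at byte
 * offset tag_offset. Returns the number of bytes written, or -1 on error. */
static long write_padded_xml(const char *path, size_t tag_offset)
{
    const char *open_tag = "<root>";
    const char *close_tag = "</root>\n";
    size_t open_len = strlen(open_tag);
    size_t i;
    int failed;
    FILE *f;

    /* The closing tag cannot start inside the opening tag. */
    if (tag_offset < open_len)
        return -1;

    f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    fputs(open_tag, f);
    /* Pad with character content up to the requested offset. */
    for (i = open_len; i < tag_offset; i++)
        fputc('x', f);
    fputs(close_tag, f);

    failed = ferror(f);
    if (fclose(f) != 0 || failed)
        return -1;
    return (long)(tag_offset + strlen(close_tag));
}
```

Calling write_padded_xml("boundary.xml", off) for a range of off values just below 16384, and parsing each file under valgrind, should cover the same cases I found by editing the file by hand. Which offset triggers the invalid read depends on which token the parser is reading at the boundary, so I can't give a single value.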
The "if" statements in guththila_next_no_char() make incorrect use of the "no" variable. Here's the fix I applied:
--- /home/steve/src/guththila-svn/src/guththila_xml_parser.c 2010-03-19 12:13:45.000000000 -0700
+++ guththila_xml_parser.c 2010-03-22 14:31:06.000000000 -0700
@@ -1773,8 +1821,8 @@
}
else if(m->reader->type == GUTHTHILA_IO_READER || m->reader->type == GUTHTHILA_FILE_READER)
{
- if(m->next < GUTHTHILA_BUFFER_PRE_DATA_SIZE(m->buffer)
- + GUTHTHILA_BUFFER_CURRENT_DATA_SIZE(m->buffer) + no && m->buffer.cur_buff != -1)
+ if(m->next + no <= GUTHTHILA_BUFFER_PRE_DATA_SIZE(m->buffer)
+ + GUTHTHILA_BUFFER_CURRENT_DATA_SIZE(m->buffer) && m->buffer.cur_buff != -1)
{
for(i = 0; i < no; i++)
{
@@ -1784,8 +1832,8 @@
return (int)no;
/* We are sure that the difference lies within the int range */
}
- else if(m->next >= GUTHTHILA_BUFFER_PRE_DATA_SIZE(m->buffer)
- + GUTHTHILA_BUFFER_CURRENT_DATA_SIZE(m->buffer) + no && m->buffer.cur_buff != -1)
+ else if(m->next + no > GUTHTHILA_BUFFER_PRE_DATA_SIZE(m->buffer)
+ + GUTHTHILA_BUFFER_CURRENT_DATA_SIZE(m->buffer) && m->buffer.cur_buff != -1)
{
/* We are sure that the difference lies within the int range */
if(m->buffer.cur_buff == (int)m->buffer.no_buffers - 1)
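
To show why the original condition over-reads, here is a minimal standalone model of the two checks. It is a sketch, not the parser code: model_t, fits_original and fits_patched are hypothetical names, pre and cur stand in for GUTHTHILA_BUFFER_PRE_DATA_SIZE and GUTHTHILA_BUFFER_CURRENT_DATA_SIZE, and the cur_buff != -1 part of the condition is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the parser state used by the check. */
typedef struct {
    size_t next; /* absolute offset of the next byte to read (m->next) */
    size_t pre;  /* models GUTHTHILA_BUFFER_PRE_DATA_SIZE(m->buffer) */
    size_t cur;  /* models GUTHTHILA_BUFFER_CURRENT_DATA_SIZE(m->buffer) */
} model_t;

/* Original test: "no" is added to the wrong side, so the fast path is
 * taken whenever next is less than (end of data + no). */
static int fits_original(const model_t *m, size_t no)
{
    return m->next < m->pre + m->cur + no;
}

/* Patched test: the fast path is taken only if all "no" bytes
 * starting at next lie inside the data currently buffered. */
static int fits_patched(const model_t *m, size_t no)
{
    return m->next + no <= m->pre + m->cur;
}
```

With next = 16383, pre = 0, cur = 16384 and no = 2, fits_original() returns 1, so the loop would copy the bytes at offsets 16383 and 16384, and the second of those is one past the end of the buffered data. fits_patched() returns 0 for the same input, so the read falls through to the branch that handles the buffer boundary. For no = 1 both checks accept the read, which is why the problem only shows up when a multi-character read straddles the boundary.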