I'm getting the following stack dump. Some XML input appears to cause this error (sorry, I have not been able to work out which XML triggers it):

java.lang.InternalError: fillbuf
java.lang.InternalError: fillbuf
    at org.apache.crimson.parser.InputEntity.parsedContent(InputEntity.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.content(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.content(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.content(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.content(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.content(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.maybeElement(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.parseInternal(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.Parser2.parse(Parser2.java(Compiled Code))
    at org.apache.crimson.parser.XMLReaderImpl.parse(XMLReaderImpl.java(Compiled Code))
Edwin - you can decide whether you want to track Crimson-related problems here or not, since Crimson doesn't have its own Bugzilla category yet, does it?
To fix this, I need some way of reproducing the bug. Please provide a simple reproducible test case. Also, this could be fixed already. Have you tried the latest version of Crimson 1.1.3?
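One possible shape for such a test case, as a rough sketch only: push a small document whose text contains "]]" through a SAX parser via a Reader that returns only a few characters per read, and check that the text comes back unchanged. The class and method names here (ChunkedParseHarness, ChunkedReader, parseText) are made up for illustration, and the sketch uses whatever parser JAXP selects by default rather than Crimson specifically. Whether short reads actually reach a parser's end-of-buffer path depends on how that parser buffers its input, so this is not guaranteed to reproduce the error.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical harness: names are illustrative, not part of Crimson.
class ChunkedParseHarness {

    /** Reader that returns at most maxChunk characters per bulk read. */
    static final class ChunkedReader extends FilterReader {
        private final int maxChunk;

        ChunkedReader(Reader in, int maxChunk) {
            super(in);
            this.maxChunk = maxChunk;
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            // Shorten the request so the parser sees many small refills.
            return super.read(cbuf, off, Math.min(len, maxChunk));
        }
    }

    /** Parses xml with the default JAXP SAX parser and returns all character data. */
    static String parseText(String xml, int maxChunk) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        Reader reader = new ChunkedReader(new StringReader(xml), maxChunk);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(reader), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // "]]" in text is well-formed as long as it is not followed by ">".
        String xml = "<doc>a]]b]c]]</doc>";
        for (int chunk = 1; chunk <= xml.length(); chunk++) {
            String text = parseText(xml, chunk);
            if (!"a]]b]c]]".equals(text)) {
                throw new AssertionError("chunk size " + chunk + " gave: " + text);
            }
        }
    }
}
```

To aim this at Crimson itself, the parser would have to be selected explicitly (for example through the JAXP factory system property), which is left out here.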
The following patch fixes the problem; I've tested it as well as I can. This is the second problem (the other was the CDATA problem) caused by "end of buffer" conditions. I have emailed Edwin a test harness for detecting these problems.

Index: InputEntity.java
===================================================================
RCS file: /home/cvspublic/xml-crimson/src/org/apache/crimson/parser/InputEntity.java,v
retrieving revision 1.3
diff -u -p -w -r1.3 InputEntity.java
--- InputEntity.java 2001/09/29 00:44:34 1.3
+++ InputEntity.java 2001/11/27 23:56:06
@@ -530,6 +530,7 @@ final class InputEntity implements Locat
     // ']]>' is a WF error -- must fail if we see it
     if (c == ']') {
+        int cdataCheckPosAfterFillBuff = 0;
         switch (finish - last) {
         // for suspicious end-of-buffer cases, get more data
         // into the buffer to rule out this sequence.
@@ -541,18 +542,44 @@ final class InputEntity implements Locat
         case 1:
             if (reader == null || isClosed)
                 continue;
-            if (last == first)
-                throw new InternalError ("fillbuf");
-            last--;
-            if (last > first) {
+            // ok, we'll need to do a fillbuf in order to read ahead here
+            // this is number of chars we'll need to check after the fillbuf
+            int cdataCheckCount = finish - last == 2 ? 1 : 2;
+
+            // consume the available buf chars
+            {
                 validator.text ();
-                contentHandler.characters (buf, first, last - first);
+                contentHandler.characters (buf, first, finish-first);
                 sawContent = true;
-                start = last;
             }
-            fillbuf ();
+            // refill the buffer and deal with eof
+            {
+                first = last = start = finish;
+                // fillbuf called
+                if(isEOF()) {
+                    // = start because of a pushback char at the start, not 0!
                     first = last = start;
                     continue;
+                }
+                first = last = start;
+            }
+            // check if the refilled buffer has the bad chars
+            {
+                // not enough chars
+                if(finish<cdataCheckCount)
+                    continue;
+                if(cdataCheckCount==1) {
+                    if(buf[0]=='>')
+                        fatal ("P-072", null);
+                } else {
+                    if(buf[0]==']' && buf[1]=='>')
+                        fatal ("P-072", null);
+                }
+                // nb: we haven't consumed these chars yet, that will happen later
+            }
+            // last is where we want it right now, the for loop will do a ++ next
+            last--;
+            continue;
         // otherwise any "]]>" would be buffered, and we can
         // see right away if that's what we have
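For anyone trying to picture the "end of buffer" condition: the forbidden sequence "]]>" can be split across two buffer fills, so a "]" at the very end of one buffer cannot be judged until more input arrives. The sketch below is not the patch's approach (the patch refills and then looks at the start of the new buffer). It shows a different, simpler technique that remembers how many consecutive "]" characters have been seen across chunks. The class and method names (CdataEndScanner, feed, containsAcrossChunks) are hypothetical and exist only for this illustration.

```java
// Hypothetical illustration, not Crimson code: detect "]]>" even when the
// sequence is split across chunk boundaries, by carrying state between chunks.
class CdataEndScanner {
    // Number of consecutive ']' characters seen so far, capped at 2.
    private int pendingBrackets;

    /** Feeds one chunk; returns true as soon as "]]>" completes inside it. */
    boolean feed(char[] buf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            char c = buf[i];
            if (c == ']') {
                if (pendingBrackets < 2) {
                    pendingBrackets++;
                }
            } else {
                if (c == '>' && pendingBrackets == 2) {
                    pendingBrackets = 0;
                    return true;
                }
                pendingBrackets = 0;
            }
        }
        return false;
    }

    /** Splits text into chunks of chunkSize and reports whether "]]>" occurs. */
    static boolean containsAcrossChunks(String text, int chunkSize) {
        CdataEndScanner scanner = new CdataEndScanner();
        char[] chars = text.toCharArray();
        for (int off = 0; off < chars.length; off += chunkSize) {
            int len = Math.min(chunkSize, chars.length - off);
            if (scanner.feed(chars, off, len)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Same answer regardless of where the chunk boundary falls.
        for (int size = 1; size <= 7; size++) {
            if (!containsAcrossChunks("ab]]>cd", size)) throw new AssertionError();
            if (containsAcrossChunks("ab]]cd>", size)) throw new AssertionError();
        }
    }
}
```

Because the state lives outside the buffer, no read-ahead or pushback is needed; the trade-off is that a real parser would also have to hold back the trailing "]" characters from the content handler until they are known to be harmless, which this sketch does not attempt.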
*** Bug 4947 has been marked as a duplicate of this bug. ***
Please fix this and get the fix to Sun for inclusion with a Java release! I have a customer who uses ]] frequently in text data. They have tripped over this bug twice.
Has this bug been fixed?
Ancient bug report against Crimson. Not an XML Commons issue. Note that Crimson, which has had no activity in years, will likely soon be retired to the Apache Attic. It's unlikely that any problems reported against it will ever be fixed.