Xerces2-J / XERCESJ-1015

Zero byte read on InputStream causes false SAXParseException when LF at end of file

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 2.6.0, 2.6.2
    • Fix Version/s: None
    • Component/s: SAX
    • Labels: None
    • Environment: Windows XP, 586

    Description

      A false SAXParseException is thrown when all of the following conditions hold:

      • The parser has reached the end of the document being parsed.
      • The InputStream supplying the document returns a length of 0 from a read (meaning no bytes are currently available).
      • There is a line feed character (without a carriage return) at the end of the file, after the last element.

      The false SAXParseException carries one of the following two messages:
      "Content is not allowed in trailing section"
      or
      "The markup in the document following the root element must be well-formed"

      This could happen if the InputStream reads from a buffer whose writer thread returns 0 instead of -1 when the buffer is closed. Although the writer should return -1 on buffer close, in practice this may not always happen, and Xerces should cater for it. It certainly should not throw a spurious exception.

      If an InputStream read returns a zero byte count in the middle of the document, an ArrayIndexOutOfBoundsException is thrown instead.
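
Both cases can be worked around on the caller's side by wrapping the stream before it reaches the parser. The following is a minimal sketch, not the attached RepeatingParser.java and not part of Xerces. The class names FlakyInputStream, ZeroReadGuardInputStream and ZeroReadGuardDemo are hypothetical, and so is the policy of treating a run of zero-byte reads as end of stream. It relies on the java.io.InputStream contract, under which read(byte[], int, int) returns 0 only when the requested length is 0.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical faulty source: every other bulk read returns 0, and so does every read at the end. */
class FlakyInputStream extends InputStream {
    private final byte[] data;
    private int pos = 0;
    private boolean stall = false;

    FlakyInputStream(byte[] data) { this.data = data; }

    @Override
    public int read() {
        // The single-byte read follows the contract: a byte, or -1 at end of stream.
        return pos < data.length ? (data[pos++] & 0xFF) : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) {
        if (len == 0) return 0;
        stall = !stall;
        // Contract violation: 0 is returned mid-document and in place of -1 at the end.
        if (stall || pos >= data.length) return 0;
        int n = Math.min(len, data.length - pos);
        System.arraycopy(data, pos, b, off, n);
        pos += n;
        return n;
    }
}

/** Hypothetical guard: retries zero-byte reads, then reports end of stream. */
class ZeroReadGuardInputStream extends FilterInputStream {
    private final int maxZeroReads;

    ZeroReadGuardInputStream(InputStream in, int maxZeroReads) {
        super(in);
        this.maxZeroReads = maxZeroReads;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) return 0; // a zero-length request may legitimately return 0
        for (int attempt = 0; attempt < maxZeroReads; attempt++) {
            int n = in.read(b, off, len);
            if (n != 0) return n; // real data, or a proper -1
        }
        // Assumption: a run of zero-byte reads means the source has been closed.
        return -1;
    }
}

class ZeroReadGuardDemo {
    /** Reads the stream to the end and decodes it as UTF-8. */
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Document with a bare line feed after the last element, as in the report.
        byte[] xml = "<root/>\n".getBytes(StandardCharsets.UTF_8);
        InputStream guarded = new ZeroReadGuardInputStream(new FlakyInputStream(xml), 3);
        // The parser only ever sees a stream that honours the read contract.
        SAXParserFactory.newInstance().newSAXParser().parse(guarded, new DefaultHandler());
        System.out.println("parsed without a SAXParseException");
    }
}
```

A retry limit that is too low would truncate a slow but healthy stream, so a real writer thread may need a short wait between attempts rather than an immediate retry.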

      I will attach sample code that demonstrates both of these cases, along with a possible fix for the problem.

      Attachments

        1. RepeatingParser.java
          7 kB
          larry oneill

          People

            Assignee: Unassigned
            Reporter: larry oneill (loneill)