Xerces2-J / XERCESJ-921

SAXParser beheading some strings

Details

    • Type: Bug
    • Status: Resolved
    • Resolution: Incomplete
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: SAX
    • Labels: None
    • Environment: Operating System: Linux
      Platform: PC
    • Bugzilla Id: 27807

    Description

      Platforms tested are AIX and Gentoo Linux.

      I have a Java program that implements ContentHandler and uses SAXParser
      to create a tab-delimited file from a subset of the information in an XML file.

      My problem is that a small percentage of the results from this code are
      beheaded, by which I mean that the string returned is only the tail of what is
      actually in the XML, with characters missing from the front of the string.

      My original XML file is more than 566 MB. I have managed to pare this down to
      about a 4 MB file, but haven't yet found a way to reproduce the problem with a
      smaller file.

      The following URLs link to the XML file and the two Java files used to parse the
      XML into the tab-delimited output:

      227.xml
      https://www.slashtmp.iu.edu/public/download.php?FILE=aarenson/7404E5qOli

      BindParserInter.java
      https://www.slashtmp.iu.edu/public/download.php?FILE=aarenson/26035zBIzer

      BindHandlerInter.java
      https://www.slashtmp.iu.edu/public/download.php?FILE=aarenson/897296MWT3R

      The following commands should compile the code and parse the XML:
      > javac BindParserInter.java
      > javac BindHandlerInter.java
      > java BindParserInter 227.xml > 227.txt

      The 227.xml file has 227 BIND-Interaction elements. The last one has the
      following subelement:

      <Org-ref_taxname>Mus musculus</Org-ref_taxname>

      After producing the tab-delimited file, the error I'm seeing is that the 7th
      field of the last line contains only 'ulus' instead of 'Mus musculus'.
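
      A possible explanation, not confirmed here because the source of
      BindHandlerInter.java is not reproduced in this report: the SAX API allows
      a parser to deliver the text of a single element in several characters()
      calls, for example when the text straddles an internal buffer boundary. A
      handler that keeps only the most recent chunk would then see just the tail
      of the string. The sketch below uses a hypothetical handler class
      (TaxnameCollector, not the reporter's code) that appends every chunk to a
      StringBuilder and reads the value only in endElement(). The split into
      'Mus musc' and 'ulus' is simulated by calling the handler directly and is
      an assumption chosen to match the symptom described above.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration; this is NOT the reporter's BindHandlerInter.
class TaxnameCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private boolean inTaxname = false;
    private String lastTaxname = null;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("Org-ref_taxname".equals(qName)) {
            inTaxname = true;
            text.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per text node, so append instead of overwrite.
        if (inTaxname) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Org-ref_taxname".equals(qName)) {
            inTaxname = false;
            lastTaxname = text.toString(); // the complete text is only known here
        }
    }

    String getLastTaxname() {
        return lastTaxname;
    }

    // Simulates a parser that delivers "Mus musculus" in two chunks (assumed split point).
    static String simulateSplitDelivery() {
        TaxnameCollector handler = new TaxnameCollector();
        char[] buffer = "Mus musculus".toCharArray();
        handler.startElement("", "", "Org-ref_taxname", null);
        handler.characters(buffer, 0, 8);  // "Mus musc"
        handler.characters(buffer, 8, 4);  // "ulus"
        handler.endElement("", "", "Org-ref_taxname");
        return handler.getLastTaxname();
    }

    public static void main(String[] args) {
        System.out.println(simulateSplitDelivery()); // prints "Mus musculus"
    }
}
```

      If the handler instead assigned new String(ch, start, length) to a field
      on each characters() call, the same two calls would leave only 'ulus' in
      that field, which matches the truncated output reported for the last line.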

    People

      Assignee: Unassigned
      Reporter: Andrew D. Arenson (arenson@spatzel.net)