  1. Xerces2-J
  2. XERCESJ-1604

Big CDATA section causes an infinite loop


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: DOM (Level 3 Core)
    • Labels: None

    Description

      Parsing a document that contains a large CDATA section (about 25 MB) with the feature "http://apache.org/xml/features/dom/defer-node-expansion" set to false causes an infinite loop.

      To reproduce, run this test class against the attached file.

      package test;

      import java.io.File;
      import java.io.FileInputStream;
      import java.util.concurrent.TimeUnit;

      import javax.xml.parsers.DocumentBuilderFactory;
      import javax.xml.xpath.XPathExpression;
      import javax.xml.xpath.XPathFactory;

      import org.w3c.dom.Document;

      public class BigCData {

          public static void main(String[] args) throws Exception {

              // Force the Xerces implementation instead of the JDK's built-in parser.
              DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(
                      "org.apache.xerces.jaxp.DocumentBuilderFactoryImpl", BigCData.class.getClassLoader());
              factory.setNamespaceAware(true);
              factory.setValidating(false);
              try {
                  // The loop is reported with deferred node expansion turned off.
                  factory.setFeature("http://apache.org/xml/features/dom/defer-node-expansion", false);
                  factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
                  // factory.setAttribute("http://apache.org/xml/properties/input-buffer-size", new Integer(100000000));
              } catch (Throwable ex) {
                  System.err.println("Cannot set IGNORE DTD feature. You can have performance problems.");
              }

              XPathExpression xpath = XPathFactory.newInstance().newXPath().compile("//style");

              // Time the parse; with the attached file this call does not return.
              long t0 = System.nanoTime();
              File file = new File("/Users/maurizio/Downloads/web_dossier_mars.xml");
              Document doc = factory.newDocumentBuilder().parse(new FileInputStream(file));
              long dt = System.nanoTime() - t0;

              System.out.println(TimeUnit.NANOSECONDS.toMillis(dt) + " parse > " + doc);

              String nodeValue = xpath.evaluate(doc);

              // Cumulative time since t0, including the XPath evaluation.
              dt = System.nanoTime() - t0;
              System.out.println(TimeUnit.NANOSECONDS.toMillis(dt) + " xpath > " + nodeValue.length());
          }

      }
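
      Because the reported symptom is a parse that never returns, it can help to run the parse under a time limit so a test run fails instead of hanging. The following is only a sketch: the class name ParseWatchdog and the method terminatesWithin are hypothetical helpers and not part of Xerces or the attached test.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not part of Xerces): reports whether a task stops within a time limit.
public class ParseWatchdog {

    /**
     * Returns true if the task finished (normally or with an exception)
     * within timeoutMillis, false if it was still running at the deadline.
     */
    public static boolean terminatesWithin(Callable<?> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-watchdog");
            t.setDaemon(true); // a stuck parse must not keep the JVM alive
            return t;
        });
        try {
            Future<?> future = executor.submit(task);
            try {
                future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (ExecutionException e) {
                return true; // the task failed, but it did terminate
            } catch (TimeoutException e) {
                future.cancel(true); // best effort; a busy loop may ignore the interrupt
                return false;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

      In the test class above, the parse call could be wrapped as ParseWatchdog.terminatesWithin(() -> factory.newDocumentBuilder().parse(file), 60000); the 60-second limit is an arbitrary choice. A false result only shows that the parse was still running at the deadline; it does not by itself distinguish a very slow parse from a true infinite loop.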

      Attachments

        1. web_dossier_mars.rar
          24 kB
          Maurizio Merli
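
        If the attachment is not at hand, a stand-in input with one large CDATA section could be generated along the following lines. This is a sketch under assumptions: the class name BigCDataGenerator, the root element name and the filler text are invented for illustration, the element name style is taken from the //style XPath in the test class, and the real structure of web_dossier_mars.xml is not known, so the generated file may not trigger the same behaviour.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical generator: writes an XML file whose <style> element holds one large CDATA section.
public class BigCDataGenerator {

    // 64 ASCII characters, so one chunk is exactly 64 bytes in UTF-8.
    private static final String CHUNK =
            "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcde\n";

    /** Writes at least cdataBytes of CDATA content to target and returns the file size in bytes. */
    public static long write(Path target, long cdataBytes) {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
            out.write("<root><style><![CDATA[");
            for (long written = 0; written < cdataBytes; written += CHUNK.length()) {
                out.write(CHUNK); // the filler never contains "]]>", so the section stays well-formed
            }
            out.write("]]></style></root>\n");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            return Files.size(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // About 25 MB of CDATA, matching the size quoted in the description.
        Path target = Paths.get(args.length > 0 ? args[0] : "big_cdata.xml");
        System.out.println(write(target, 25L * 1024 * 1024) + " bytes written to " + target);
    }
}
```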

        Issue Links

          Activity

            People

              Assignee: Unassigned
              Reporter: Maurizio Merli
              Votes: 0
              Watchers: 1
