Xerces2-J / XERCESJ-727

cloneNode(true) Elements are still linked to the original Element

    Details

    • Type: Bug
    • Status: Open
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: DOM (Level 3 Core)
    • Labels:
      None
    • Environment:
      Operating System: Windows NT/2K
      Platform: PC

      Description

      NB : tested only under Windows 2000/ jdk 1.4.1_02
      -------

      • Create a DOM Element.
      • Create many threads, giving each of them a cloneNode(true) copy of the
        Element.
      • Start all the threads.
      • Each thread just finds the Element's embedded sub-Elements.

      The following Exceptions occur many times (but not always) :
      java.lang.NullPointerException
      at org.apache.xerces.dom.ParentNode.nodeListItem(Unknown Source)
      at org.apache.xerces.dom.ParentNode.item(Unknown Source)
      at Tst2Threads.run(Tst2Threads.java:40)
      java.lang.NullPointerException
      at org.apache.xerces.dom.ParentNode.nodeListItem(Unknown Source)
      at org.apache.xerces.dom.ParentNode.item(Unknown Source)
      at Tst2Threads.run(Tst2Threads.java:41)

      where lines 40 and 41 are:
      if ( lesNoeuds.item(i).getNodeType() != Element.ELEMENT_NODE ) continue;
      noeud = (Element)(lesNoeuds.item( i ));
      in the following example.
      NB: if I create a different Element from the file for each thread
      (different Element instances, not created with cloneNode), then no Exception
      occurs.

      This seems to be wrong behaviour, and it keeps Xerces from being used in a
      multi-threaded environment.
      I understand that Xerces is not thread safe, but how can I use the same data in
      different threads if cloning fails this way?
      How can I write correct thread-safe code using DOM data?

      These symptoms seem close to bug #6885, but at first sight this is NOT a DUPLICATE:
      there is no concurrent access to the same instance in my example.

      Please tell me if there is any error in my code. I am interested in any
      workaround for this behaviour...

      Francois.

      ***************************************
      Here follows a code example (sorry for the poor comments).
      Usage:

      • just run it from the command line,
      • enter an XML file name; the Element used will be the first one read from the file (I
        suggest a sizeable Element),
      • enter a number of threads,
        ==> with 200 threads the Exception occurs almost immediately.
      ***************************************
      import org.w3c.dom.*;

      public class Tst2Threads extends Thread
      {
          // Parse the XML document found at the given URL.
          static protected Document readDOM( String nomFichier ) throws Exception
          {
              java.net.URL url = new java.net.URL( nomFichier );
              java.io.InputStream inputXML = url.openStream();
              javax.xml.parsers.DocumentBuilderFactory factory =
                  javax.xml.parsers.DocumentBuilderFactory.newInstance();
              factory.setValidating( false );
              factory.setNamespaceAware( false );
              factory.setIgnoringComments( true );
              javax.xml.parsers.DocumentBuilder builder = factory.newDocumentBuilder();
              Document document = builder.parse( inputXML );
              return document;
          }

          /*******************************************************/
          Element element = null;

          public Tst2Threads(ThreadGroup gp, Element elt, String _nom)
          {
              super( gp, _nom );
              element = elt;
          }

          // Each thread loops forever over the child nodes of its own cloned Element.
          public void run()
          {
              NodeList lesNoeuds = element.getChildNodes();
              Element noeud;
              while (true)
              {
                  for (int i=0, max=lesNoeuds.getLength(); i<max ; i++)
                  {
                      if ( lesNoeuds.item(i).getNodeType() != Element.ELEMENT_NODE ) continue;
                      noeud = (Element)(lesNoeuds.item( i ));
                  }
              }
          }

          // get first Element
          static Element readElmt(String nomFic) throws Exception
          {
              Document doc = readDOM( "file://"+nomFic );

              NodeList liste = doc.getChildNodes();
              boolean done = false;
              Element elmt = null;
              if (liste != null)
              {
                  for (int i=0, max=liste.getLength(); (i<max) && (! done); i++)
                  {
                      if ( liste.item(i).getNodeType() != Element.ELEMENT_NODE ) continue;
                      elmt = (Element)(liste.item( i ));
                      done = true;
                  }
              }
              return elmt;
          }

          /*******************************************************/
          public static void main( java.lang.String[] args)
          {
              try
              {
                  java.io.BufferedReader br = new java.io.BufferedReader(
                      new java.io.InputStreamReader( System.in ));

                  System.out.print("XML file name for Element description [d:/tmp/menu.xml] :");
                  String nomFic = br.readLine();
                  if ("".equals(nomFic)) nomFic = "d:/tmp/menu.xml";
                  Element elmt = readElmt(nomFic);

                  System.out.print("Nb of threads [200] :");
                  String s = br.readLine();
                  if ("".equals(s)) s = "200";
                  int nbThreads = Integer.valueOf(s).intValue() - 1;

                  System.out.println(" Starting for :\n file=" + nomFic
                      + "\n nb threads=" + nbThreads);

                  ThreadGroup groupe = new ThreadGroup("TEST");
                  Tst2Threads t;
                  for (int i=0; i<nbThreads; i++)
                  {
                      // Element cloned here (from main thread --v)
                      t = new Tst2Threads(groupe, (Element)(elmt.cloneNode(true)), "Th"+i);
                      t.start();
                  }
              }
              catch (Throwable t)
              {
                  t.printStackTrace();
              }
          }
      }

        Activity

        Aaron Kardell added a comment -

        This issue may or may not be related to the one reported in XERCESJ-525, for which I just submitted test cases and a patch.

        Gili added a comment -

        Netbeans seems to be getting hit particularly badly due to this bug. Please see http://www.netbeans.org/issues/show_bug.cgi?id=50198 for more information.

        Can someone please assign someone to this bug?

        Michael Glavassevich added a comment -

        Xerces' DOM implementation is not thread safe [1]. There's no requirement that a DOM be thread safe, so applications need to make sure that threads are properly synchronized for concurrent access to the DOM. This is true even if you're just invoking read operations.

        [1] http://xml.apache.org/xerces2-j/faq-dom.html#faq-1

        Francois Abot added a comment -

        Michael: agreed, Xerces is not thread safe.

        But for my multi-threaded application, I want to duplicate an object so that each thread has its own dedicated instance, which I can then modify separately.
        This is absolutely impossible if cloneNode behaves as it does now.

        I am not asking Xerces to be thread safe.
        I only want Xerces to allow me to implement thread-safe features, without making assumptions about how I implement them.

        I think this bug is not related to the thread safety of Xerces itself, but it prevents developers from implementing thread safety on top of it.

        Francois.

        Michael Glavassevich added a comment -

        I still believe this is a threading problem. A cloned node has the same owner document as the original node, so they're part of the same DOM even though the new node doesn't have a parent yet.

        In DOM, there are two ways of traversing a node's children. One is to walk the child/sibling chain with getFirstChild()/getNextSibling(). The other is to iterate over the node list returned from getChildNodes(). In Xerces' DOM implementation, siblings are stored as a linked list, so to get from node 1 to node n you have to walk through all the nodes in between.

        To improve the performance of sequential lookups, as well as of getting the length, Xerces uses a cache. These caches, which keep track of the last position accessed, are managed by the owner document and get redistributed to other nodes when needed. If you're multi-threading with node lists, the cache could get swiped away from one node and given to another while the first one is still using it, hence the NullPointerException.
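
        The point about the shared owner document can be checked with a few lines of standard DOM code. This is only a minimal sketch: the class name CloneOwnerCheck and its helper methods are made up for illustration, and it shows nothing about the Xerces cache itself, only that a deep clone reports the same owner document as its source and has no parent.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of Xerces or of the reporter's test case.
final class CloneOwnerCheck {

    // Creates an empty document with whatever JAXP parser is on the classpath.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true when a deep clone has no parent but still reports
    // the original document as its owner.
    static boolean cloneSharesOwnerDocument() {
        Document doc = newDocument();
        Element root = doc.createElement("menu");
        root.appendChild(doc.createElement("item"));
        doc.appendChild(root);

        Element copy = (Element) root.cloneNode(true);
        return copy.getParentNode() == null && copy.getOwnerDocument() == doc;
    }

    public static void main(String[] args) {
        System.out.println("clone shares owner document: " + cloneSharesOwnerDocument());
    }
}
```

        So every clone handed to a thread in the test case still belongs to the one Document that was parsed in main.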

        Francois Abot added a comment -

        So how can I solve or work around my problem?

        If I choose to have one document per thread, then:
        node2 = node.cloneNode(true);
        threadDocument.appendChild(node2);

        Would it be sufficient to reset the node2 cache?

        NB: By the way, the Crimson implementation executes "cloneNode" without this bad effect (I mean bad for me).
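
        A possible way to get one document per thread is Document.importNode, since under the DOM specification appendChild raises WRONG_DOCUMENT_ERR for a node created by a different document. The sketch below uses hypothetical names (PerThreadCopy, copyIntoOwnDocument), and nobody in this thread has confirmed that it avoids the NullPointerException. It only follows from the statement above that the caches are managed by the owner document.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class for illustration only.
final class PerThreadCopy {

    // Creates an empty document with whatever JAXP parser is on the classpath.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Deep-copies 'source' into a brand new document and returns the copy.
    // The copy is owned by the new document, not by source.getOwnerDocument().
    static Element copyIntoOwnDocument(Element source) {
        Document target = newDocument();
        Element copy = (Element) target.importNode(source, true);
        target.appendChild(copy); // the copy becomes the new document's root element
        return copy;
    }

    public static void main(String[] args) {
        Document original = newDocument();
        Element root = original.createElement("menu");
        root.appendChild(original.createElement("item"));
        original.appendChild(root);

        Element copy = copyIntoOwnDocument(root);
        System.out.println("same owner document: " + (copy.getOwnerDocument() == original));
    }
}
```

        The copies would have to be made before the worker threads start, or while holding a lock on the source document, because importNode reads the source tree.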

        Michael Glavassevich added a comment -

        Walking the child/sibling chain should work since it doesn't cause any internal writes. You could change the code which uses NodeLists to something like:

        Node current = element.getFirstChild();
        while (current != null)
        {
            /** do something. **/
            current = current.getNextSibling();
        }

        ... though really the best thing to do is to synchronize on the DOM.
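
        Synchronizing on the DOM could look roughly like the sketch below. The choice of lock object is an assumption: the thread does not say what to lock on, and here every thread locks on the owner document, which all clones share. The class and method names (SynchronizedDomRead, countChildElements) are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class for illustration only.
final class SynchronizedDomRead {

    // Creates an empty document with whatever JAXP parser is on the classpath.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts child elements through a NodeList while holding a lock on the
    // owner document, so only one thread at a time touches this DOM.
    static int countChildElements(Element element) {
        Document lock = element.getOwnerDocument(); // assumption: all threads use this lock
        synchronized (lock) {
            NodeList children = element.getChildNodes();
            int count = 0;
            for (int i = 0; i < children.getLength(); i++) {
                if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Document doc = newDocument();
        Element root = doc.createElement("menu");
        for (int i = 0; i < 5; i++) {
            root.appendChild(doc.createElement("item"));
        }
        doc.appendChild(root);

        // Clone everything first, on the main thread, before any worker starts.
        Element[] copies = new Element[8];
        for (int i = 0; i < copies.length; i++) {
            copies[i] = (Element) root.cloneNode(true);
        }

        Thread[] workers = new Thread[copies.length];
        for (int i = 0; i < workers.length; i++) {
            Element copy = copies[i];
            workers[i] = new Thread(() -> System.out.println(countChildElements(copy)));
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
    }
}
```

        The cost is that reads are serialized across all threads sharing the document, which is why the sibling-chain walk above may be preferable for read-only traversal.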


          People

          • Assignee: Unassigned
          • Reporter: Francois Abot
          • Votes: 1
          • Watchers: 0