The talk "Effective DoS attacks against Web Application Plattforms - #hashDoS", given at the Chaos Communication Congress (28C3) last week, showed that many web applications are vulnerable to hash collisions in POST parameters. Descriptions of the problem can be found at https://cryptanalysis.eu/blog/2011/12/28/effective-dos-attacks-against-web-application-plattforms-hashdos/ and http://permalink.gmane.org/gmane.comp.security.full-disclosure/83694
I wanted to determine whether Xerces would also be affected by hash collision attacks, so I prepared a 2 MB document consisting of a single root element with about 125000 attributes, all having the same java.lang.String#hashCode. Parsing this document with Xerces 2.9.1 on an i7-2620 notebook took about 8 minutes with one core at 100% CPU usage. According to the NetBeans profiler, 56% of that time was spent inside org.apache.xerces.util.SymbolTable#addSymbol and another 42% in org.apache.xerces.util.XMLAttributesImpl#checkDuplicatesNS.
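For reference, a minimal sketch (not the actual file I used) of how such colliding names can be generated. It relies on the well-known fact that "Aa" and "BB" share a java.lang.String#hashCode, so any concatenation of k such two-character blocks yields 2^k distinct strings with identical hashes:

```java
import java.util.ArrayList;
import java.util.List;

public class HashCollisions {
    // "Aa".hashCode() == "BB".hashCode() == 2112, and
    // hash(s + t) = hash(s) * 31^len(t) + hash(t), so substituting
    // one block for the other anywhere leaves the hash unchanged.
    static List<String> collidingNames(int blocks) {
        List<String> names = new ArrayList<>();
        names.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>(names.size() * 2);
            for (String s : names) {
                next.add(s + "Aa");
                next.add(s + "BB");
            }
            names = next;
        }
        return names; // 2^blocks strings, all with the same hashCode
    }

    public static void main(String[] args) {
        // 17 blocks give 131072 names, enough for ~125000 attributes
        List<String> names = collidingNames(3); // small demo: 8 names
        int h = names.get(0).hashCode();
        for (String s : names) {
            if (s.hashCode() != h) throw new AssertionError(s);
        }
        System.out.println(names.size() + " colliding names");
    }
}
```

Emitting these as attribute names on a single element reproduces the degenerate SymbolTable behaviour described above.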
This behaviour can also be triggered by web service calls, so it is a serious problem. The workaround in Tomcat was to impose a limit on the maximum number of parameters in a POST request; perhaps a similar setting could be introduced for Xerces, configurable via a JAXP parser feature.
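For comparison, Tomcat's mitigation is the maxParameterCount attribute on the Connector element in server.xml (the value below is illustrative, not a recommendation):

```
<!-- server.xml: cap the number of request parameters parsed per request -->
<Connector port="8080" protocol="HTTP/1.1"
           maxParameterCount="1000" />
```

An analogous cap on the number of attributes per element would bound the worst-case work Xerces does in SymbolTable#addSymbol and checkDuplicatesNS.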
I can provide the XML file showcasing this problem, but I would prefer not to post it to a public bug tracker.