Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.6.2, 2.9.1
Fix Version/s: None
Description
Under certain conditions, the contains() method in XMLSchemaValidator$ValueStoreBase can cripple the performance of parsing and validation.
I'm not sure what those conditions are, but as a guideline figure: I was using JAXB2 to deserialize a 22 MB XML file. Without schema validation, it took 5 seconds. With validation, it took over 3 minutes (JDK 1.5.0_10 on win32). My profiler pointed the finger squarely at that method in XMLSchemaValidator.
My suspicions were aroused further on seeing this comment in the source:
public boolean contains() {
// REVISIT: we can improve performance by using hash codes, instead of
// traversing global vector that could be quite large.
This comment is present in the Xerces 2.6.2 bundled with JDK 1.5.0_10, and also in the source for 2.9.1.
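To illustrate what the REVISIT comment seems to be hinting at, here is a minimal sketch. It is not the Xerces code: the class names ValueStoreSketch, LinearStore and HashedStore are made up for this report, and modelling each stored value as a List<String> tuple is an assumption. It only contrasts a store whose contains() walks the whole list on every check with one that uses hash codes for the same duplicate check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Xerces.
public class ValueStoreSketch {

    // Mimics the pattern the comment describes: every contains() call
    // traverses all previously stored tuples, so n inserts cost O(n^2).
    static final class LinearStore {
        private final List<List<String>> values = new ArrayList<>();

        boolean contains(List<String> tuple) {
            for (List<String> stored : values) {
                if (stored.equals(tuple)) {
                    return true;
                }
            }
            return false;
        }

        // Returns false if the tuple was already present.
        boolean add(List<String> tuple) {
            if (contains(tuple)) {
                return false;
            }
            values.add(tuple);
            return true;
        }
    }

    // Same contract, but lookups go through hashCode()/equals(),
    // so each check is expected constant time.
    static final class HashedStore {
        private final Set<List<String>> values = new HashSet<>();

        boolean contains(List<String> tuple) {
            return values.contains(tuple);
        }

        // Set.add already reports whether the element was new.
        boolean add(List<String> tuple) {
            return values.add(tuple);
        }
    }

    public static void main(String[] args) {
        int n = 20000; // arbitrary size chosen for the demonstration

        LinearStore linear = new LinearStore();
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            linear.add(Arrays.asList("key", Integer.toString(i)));
        }
        long linearMs = (System.nanoTime() - t0) / 1000000;

        HashedStore hashed = new HashedStore();
        long t1 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            hashed.add(Arrays.asList("key", Integer.toString(i)));
        }
        long hashedMs = (System.nanoTime() - t1) / 1000000;

        System.out.println("linear scan: " + linearMs + " ms");
        System.out.println("hash lookup: " + hashedMs + " ms");
    }
}
```

With a large document containing many values that have to be checked for uniqueness, the quadratic growth of the first variant would be consistent with the slowdown I measured, though I have not confirmed that this is the only cause.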