Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version: XML Commons Resolver 1.2.0
44426
Description
W3C receives an immense amount of DTD traffic, often from clients whose
User-Agent header identifies them only as Python or Java.
http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic
In a number of cases we have heard back from people affected by our automated
blocking, who indicate they are running Xalan and/or Xerces to validate XML or
perform XSL transforms. We have directed some of those we have corresponded
with to your catalog instructions.
http://xerces.apache.org/xerces2-j/faq-xcatalogs.html
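As a rough illustration of what a catalog achieves, the sketch below maps public
identifiers to local copies so the parser never has to contact w3.org for them.
It uses only the standard SAX EntityResolver interface. The class name
LocalDtdResolver and the file paths passed to it are hypothetical; this is not
the Xerces catalog implementation described in the FAQ above.

```java
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Xerces or XML Commons Resolver.
class LocalDtdResolver implements EntityResolver {

    // Public identifier -> URI of a local copy (paths are supplied by the caller).
    private final Map<String, String> localCopies;

    LocalDtdResolver(Map<String, String> localCopies) {
        this.localCopies = localCopies;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = (publicId == null) ? null : localCopies.get(publicId);
        if (local == null) {
            // Returning null tells the parser to use its default resolution,
            // which for an http: system ID means a network fetch.
            return null;
        }
        InputSource source = new InputSource(local);
        source.setPublicId(publicId);
        return source;
    }
}
```

A resolver like this would be installed with DocumentBuilder.setEntityResolver
or XMLReader.setEntityResolver before parsing.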
The vast majority of Xalan/Xerces installations most likely implement neither
catalogs nor caching of external DTDs and other schemata. The resolver also
appears to ignore both HTTP response codes and caching directives.
http://www.ietf.org/rfc/rfc2616.txt
Better than a default catalog would be a caching XML Catalog resolver, such as
the one I understand is part of Glassfish.
http://norman.walsh.name/2007/09/07/treadLightly
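To make the request concrete, here is a minimal sketch of a caching resolver:
it fetches each system identifier at most once and treats any non-200 response
as an error rather than retrying. All names here (CachingEntityResolver,
Fetcher, Response) are hypothetical and are not taken from Glassfish, Xerces, or
XML Commons Resolver. The cache is in memory and never expires, so a real
implementation would additionally need to honor Cache-Control and Expires
headers and persist entries across runs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical sketch for illustration; not an existing library class.
class CachingEntityResolver implements EntityResolver {

    /** Outcome of one HTTP fetch: status code plus body (null on error). */
    static final class Response {
        final int status;
        final byte[] body;

        Response(int status, byte[] body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Pluggable transport, so the caching policy can be tested without a network. */
    interface Fetcher {
        Response fetch(String url) throws IOException;
    }

    private final Fetcher fetcher;
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

    CachingEntityResolver(Fetcher fetcher) {
        this.fetcher = fetcher;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException {
        if (systemId == null) {
            return null; // nothing to fetch; let the parser decide
        }
        byte[] body = cache.get(systemId);
        if (body == null) {
            Response response = fetcher.fetch(systemId);
            if (response.status != 200 || response.body == null) {
                // Surface the server's answer (for example 403 or 503) to the caller.
                throw new IOException("HTTP " + response.status + " for " + systemId);
            }
            body = response.body;
            cache.put(systemId, body);
        }
        InputSource source = new InputSource(new ByteArrayInputStream(body));
        source.setSystemId(systemId);
        source.setPublicId(publicId);
        return source;
    }
}
```

A production Fetcher could wrap HttpURLConnection, reading the status from
getResponseCode() and sending a descriptive User-Agent header.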
There are other Java libraries contributing to this traffic as well. Xalan and
Xerces are widely used, important libraries. Your assistance in reducing this
excessive traffic to W3C and others hosting standards schemata would be greatly
appreciated.