Details
Bug
Status: Resolved
Major
Resolution: Fixed
Operating System: All
Platform: All
27423
Description
Version 1.3final.
The meta tag parsing in the demo HTML parser
(demo/org/apache/lucene/demo/html/HTMLParser.jj) incorrectly relies on the meta
tag's "name" attribute coming before its "content" attribute. In XML and HTML,
attribute order is not significant.
So, if I have tags:
<meta content="blah" name="blarg" />
<meta content="gluh" name="glarg" />
...the parser will not parse them correctly. It simply fills in name/content
pairs as it encounters attributes in the stream, without regard to which meta
tag each attribute actually belongs to. So, in the above example, I get a single
meta property, "blarg"="gluh", instead of "blarg"="blah" and "glarg"="gluh".
This is a problem because my XSLT happens to result in meta tags with attributes
in the above order.
It may not seem like a big deal since it's in demo code, but because
HTMLParser.jj is many times faster than more heavyweight solutions, I'd love
for this to be fixed, if possible.
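
To illustrate the behavior I would expect, here is a minimal standalone Java sketch of order-independent meta parsing. It is not the HTMLParser.jj grammar and not a proposed patch: the class name MetaTagSketch, the method parseMetaTags, and the regex-based scanning are all assumptions made for illustration. The point is only that attributes are collected per meta tag first, and the name/content pair is emitted after the whole tag has been read.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Lucene demo code.
public class MetaTagSketch {
    // Matches one complete <meta ...> tag; group 1 is the raw attribute text.
    // Simplification: a '>' inside a quoted attribute value is not handled.
    private static final Pattern META_TAG =
        Pattern.compile("<meta\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // Matches attr="value" or attr='value'; group 3 or group 4 holds the value.
    private static final Pattern ATTRIBUTE =
        Pattern.compile("([A-Za-z_:][-A-Za-z0-9_:.]*)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    // Returns meta name -> content, pairing attributes within each tag only.
    public static Map<String, String> parseMetaTags(String html) {
        Map<String, String> properties = new LinkedHashMap<>();
        Matcher tag = META_TAG.matcher(html);
        while (tag.find()) {
            // State is reset for every tag, so attributes never leak across tags.
            String name = null;
            String content = null;
            Matcher attr = ATTRIBUTE.matcher(tag.group(1));
            while (attr.find()) {
                String key = attr.group(1).toLowerCase(Locale.ROOT);
                String value = attr.group(3) != null ? attr.group(3) : attr.group(4);
                if (key.equals("name")) {
                    name = value;
                } else if (key.equals("content")) {
                    content = value;
                }
            }
            // Emit the pair only once the whole tag has been seen,
            // regardless of the order in which the attributes appeared.
            if (name != null && content != null) {
                properties.put(name, content);
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        String html = "<meta content=\"blah\" name=\"blarg\" />\n"
                    + "<meta content=\"gluh\" name=\"glarg\" />";
        System.out.println(parseMetaTags(html)); // prints {blarg=blah, glarg=gluh}
    }
}
```

With the two tags from this report, the sketch yields "blarg"="blah" and "glarg"="gluh". The same idea should carry over to the grammar: remember both attribute values while inside a meta tag and add the property only when the tag closes.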