41846 – Canonicalization failed with some latin2 characters

Summary: Canonicalization failed with some latin2 characters

Status:	CLOSED DUPLICATE of bug 41462

Alias:	None

Product:	Security - Now in JIRA
Classification:	Unclassified
Component:	Canonicalization
Version:	unspecified
Hardware:	PC Windows XP

Importance:	P1 critical
Target Milestone:	---
Assignee:	XML Security Developers Mailing List

URL:
Keywords:

Depends on:
Blocks:

Reported:	2007-03-15 05:44 UTC by matej
Modified:	2007-09-19 12:32 UTC
CC List:	0 users

Description matej 2007-03-15 05:44:36 UTC

Canonicalization fails with some Latin-2 characters, 'čćžšđČĆŽŠĐ' (letters with
caron, ... ).

Release 1.3.0 does not have this problem.

Code which demonstrates bug:

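import java.io.ByteArrayInputStream;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.xml.security.c14n.implementations.Canonicalizer20010315WithComments;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// wrapper class added so the snippet compiles; the class name is arbitrary
public class C14nLatin2Test {
    public static void main(String[] args) throws Exception {
        // parse document
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        // text contains some latin2 characters 'čćžšđČĆŽŠĐ'
        String text = "<text>\u010D\u0107\u017E\u0161\u0111\u010C\u0106\u017D\u0160\u0110</text>";
        Document doc = db.parse(new ByteArrayInputStream(text.getBytes("UTF-8")));
        Element e_latin2 = doc.getDocumentElement();

        // canonicalize the root element (canonical XML output is UTF-8 encoded)
        Canonicalizer20010315WithComments c14 = new Canonicalizer20010315WithComments();
        byte[] canon_bin = c14.engineCanonicalizeSubTree(e_latin2);

        // the canonical form of this document should equal the UTF-8 input bytes
        if (Arrays.equals(text.getBytes("UTF-8"), canon_bin))
            System.out.println("OK");
        else
            System.out.println("Failed");
    }
}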
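
The check above only reports OK or Failed. To see where the canonicalizer output diverges from the expected UTF-8 bytes, a small diagnostic along the following lines may help. This is only a sketch: the class name Utf8Diff and the helpers firstMismatch and hex are hypothetical and not part of the XML Security library, and the "actual" bytes are simulated with a lossy ISO-8859-1 encoding purely for illustration. That simulation is not a claim about what the faulty release actually emits; in a real run, canon_bin from the code above would take its place.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; not part of the XML Security library.
class Utf8Diff {
    // Returns the index of the first differing byte, or -1 if the arrays are identical.
    public static int firstMismatch(byte[] expected, byte[] actual) {
        int n = Math.min(expected.length, actual.length);
        for (int i = 0; i < n; i++) {
            if (expected[i] != actual[i]) {
                return i;
            }
        }
        // Same common prefix: the arrays differ only if their lengths differ.
        return expected.length == actual.length ? -1 : n;
    }

    // Formats bytes as space-separated, two-digit uppercase hex values.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "<text>\u010D\u0107\u017E\u0161\u0111\u010C\u0106\u017D\u0160\u0110</text>";
        byte[] expected = text.getBytes(StandardCharsets.UTF_8);

        // Assumption: stand-in for canon_bin. ISO-8859-1 cannot represent these
        // characters, so this merely simulates some wrong output for the demo.
        byte[] actual = text.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("expected: " + hex(expected));
        System.out.println("actual:   " + hex(actual));
        System.out.println("first mismatch at byte index: " + firstMismatch(expected, actual));
    }
}
```

Each of the ten test characters lies between U+0080 and U+07FF, so each takes two bytes in UTF-8. A mismatch index of 6 or later therefore points at the character data rather than at the <text> start tag.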

Comment 1 sean.mullan 2007-03-20 09:47:14 UTC

I can't reproduce this with the latest sources. Probably a dup of 41462.

*** This bug has been marked as a duplicate of 41462 ***

Comment 2 sean.mullan 2007-09-19 12:32:01 UTC

Closing old bugs.