Bug 47353 - Bug in canonicalization from an XPathNodeList
Summary: Bug in canonicalization from an XPathNodeList
Status: CLOSED FIXED
Alias: None
Product: Security - Now in JIRA
Classification: Unclassified
Component: C++ Canonicalization (show other bugs)
Version: C++ 1.5.1
Hardware: PC Linux
: P2 normal
Target Milestone: ---
Assignee: XML Security Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-06-11 08:08 UTC by John Keeping
Modified: 2009-08-27 13:43 UTC (History)
0 users



Attachments
Test case which demonstrates the buggy behaviour (4.43 KB, text/plain)
2009-06-11 08:09 UTC, John Keeping
Details
Test file for use with the above program (827 bytes, text/plain)
2009-06-11 08:11 UTC, John Keeping
Details

Note You need to log in before you can comment on or make changes to this bug.
Description John Keeping 2009-06-11 08:08:31 UTC
When canonicalizing a subsection of an XML file using a XSECC14n20010315 canonicalizer, if the section is set via setXPathMap and the elements in the XPath set have no namespace prefix but are in a namespace defined on a parent element in the original document then the canonicalized output is incorrect.

For example, given the following XML:

<Document xmlns="http://www.example.com/document">
  <Data id="data1">
    <FileName>image.jpeg</FileName>
    <FileType>image/jpeg</FileType>
  </Data>
</Document>

if an enveloped signature transform is applied to #data1 (ignoring for now the lack of signature there...), the result of canonicalization with xml-security-c is:

<Data xmlns="http://www.example.com/document" id="data1">
    <FileName xmlns="http://www.example.com/document">image.jpeg</FileName>
    <FileType xmlns="http://www.example.com/document">image/jpeg</FileType>
  </Data>

but it should be:

<Data xmlns="http://www.example.com/document" id="data1">
    <FileName>image.jpeg</FileName>
    <FileType>image/jpeg</FileType>
  </Data>

I will attach a test program and sample file which demonstrate this.
Comment 1 John Keeping 2009-06-11 08:09:33 UTC
Created attachment 23794 [details]
Test case which demonstrates the buggy behaviour
Comment 2 John Keeping 2009-06-11 08:11:35 UTC
Created attachment 23795 [details]
Test file for use with the above program

This file can be used to test behaviour with the above test program as follows:

./test test.xml data1 test.out

The signature and digest are certainly corrupt and incorrect, but it suffices for demonstrating the problem.
Comment 3 John Keeping 2009-06-11 08:21:56 UTC
I was about to attach the same patch I posted to the mailing list at http://article.gmane.org/gmane.text.xml.security.devel/6707 but I've thought about it a bit more and I think the correct solution probably involves a change in XSECXMLNSStack::printNamespace. The problem is that namespace attribute nodes for the default namespace prefix (i.e. no prefix) are never marked as printed, but in fact they are and don't need to be printed again.

I assume that there's a good reason why they aren't marked as printed though, so I doubt it is as simple as removing the m_isDefault check.
Comment 4 Scott Cantor 2009-06-11 08:32:43 UTC
That's why I'm reluctant to fix it, since I didn't write any of the code. Let me see if we can poke Berin about it and have him review it.
Comment 5 Scott Cantor 2009-06-13 10:28:31 UTC
Confirmed with latest (soon to be 1.5.0) code and without Xalan.

Doesn't occur if the canonicalizer is run directly on the marked element, so definitely requires the XPath "simulation" code introduced by the transform to illustrate the bug.
Comment 6 Scott Cantor 2009-06-13 13:12:19 UTC
After trying to understand the code, you're definitely correct that the fix isn't going to be just removing the check inside the printNamespace method.

My reading of the algorithm is that there's a very complex set of interactions between the namespace stack and the logic to handle default namespaces. The point of the stack is return the set of "in scope" namespaces that haven't been printed yet. The isDefault flag is used to ensure that the default namespace in scope is always returned by the stack, even if it's actually been printed.

That in turn causes the stack-based output logic to see a non-empty default declaration and set the xmlnsFound flag to true, which then *prevents* the code down below that might output xmlns="" from running.

Altering the printNamespace logic causes it to skip the default namespace declaration in the stack, never returning it, which then skips the code that notices it's in scope. At that point the xmlns="" logic runs, and because it finds the non-empty default declaration up above in the tree, it outputs xmlns="", which obviously is incorrect.

Without a total overhaul, I would have to assume that your original patch is closer to the mark, even though I have no way to know what interactions that might cause.

Alternatively, something would need to be added to the stack interface to allow the xmlns="" logic to use it even if the first/next methods don't return the default declaration, which would in turn fix the extra printing you're seeing.
Comment 7 Scott Cantor 2009-07-20 14:12:31 UTC
Fix in subversion:
http://svn.apache.org/viewvc?view=rev&revision=796012