When canonicalizing a subsection of an XML file using a XSECC14n20010315 canonicalizer, if the section is set via setXPathMap and the elements in the XPath set have no namespace prefix but are in a namespace defined on a parent element in the original document then the canonicalized output is incorrect. For example, given the following XML: <Document xmlns="http://www.example.com/document"> <Data id="data1"> <FileName>image.jpeg</FileName> <FileType>image/jpeg</FileType> </Data> </Document> if an enveloped signature transform is applied to #data1 (ignoring for now the lack of signature there...), the result of canonicalization with xml-security-c is: <Data xmlns="http://www.example.com/document" id="data1"> <FileName xmlns="http://www.example.com/document">image.jpeg</FileName> <FileType xmlns="http://www.example.com/document">image/jpeg</FileType> </Data> but it should be: <Data xmlns="http://www.example.com/document" id="data1"> <FileName>image.jpeg</FileName> <FileType>image/jpeg</FileType> </Data> I will attach a test program and sample file which demonstrate this.
Created attachment 23794 [details] Test case which demonstrates the buggy behaviour
Created attachment 23795 [details] Test file for use with the above program This file can be used to test behaviour with the above test program as follows: ./test test.xml data1 test.out The signature and digest are certainly corrupt and incorrect, but it suffices for demonstrating the problem.
I was about to attach the same patch I posted to the mailing list at http://article.gmane.org/gmane.text.xml.security.devel/6707 but I've thought about it a bit more and I think the correct solution probably involves a change in XSECXMLNSStack::printNamespace. The problem is that namespace attribute nodes for the default namespace prefix (i.e. no prefix) are never marked as printed, but in fact they are and don't need to be printed again. I assume that there's a good reason why they aren't marked as printed though, so I doubt it is as simple as removing the m_isDefault check.
That's why I'm reluctant to fix it, since I didn't write any of the code. Let me see if we can poke Berin about it and have him review it.
Confirmed with latest (soon to be 1.5.0) code and without Xalan. Doesn't occur if the canonicalizer is run directly on the marked element, so definitely requires the XPath "simulation" code introduced by the transform to illustrate the bug.
After trying to understand the code, you're definitely correct that the fix isn't going to be just removing the check inside the printNamespace method. My reading of the algorithm is that there's a very complex set of interactions between the namespace stack and the logic to handle default namespaces. The point of the stack is return the set of "in scope" namespaces that haven't been printed yet. The isDefault flag is used to ensure that the default namespace in scope is always returned by the stack, even if it's actually been printed. That in turn causes the stack-based output logic to see a non-empty default declaration and set the xmlnsFound flag to true, which then *prevents* the code down below that might output xmlns="" from running. Altering the printNamespace logic causes it to skip the default namespace declaration in the stack, never returning it, which then skips the code that notices it's in scope. At that point the xmlns="" logic runs, and because it finds the non-empty default declaration up above in the tree, it outputs xmlns="", which obviously is incorrect. Without a total overhaul, I would have to assume that your original patch is closer to the mark, even though I have no way to know what interactions that might cause. Alternatively, something would need to be added to the stack interface to allow the xmlns="" logic to use it even if the first/next methods don't return the default declaration, which would in turn fix the extra printing you're seeing.
Fix in subversion: http://svn.apache.org/viewvc?view=rev&revision=796012