elharo added a comment - 08/Apr/05 09:32 PM
Xerces is correct. This URI is syntactically incorrect according to RFC 3986. The authority component cannot have two colons when used with an IPv4 literal address. In essence, this URI tries to have two ports. I'm not familiar with the spec you reference, but it does not appear to conform to the URI specification.
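To illustrate the "two ports" reading, here is a minimal sketch using Python's standard `urllib.parse`. It is not how Xerces parses URIs; the helper name `describe` is made up for this example. It shows only that a generic splitter sees `192.168.0.1:1002:3002` as the authority and cannot reduce what follows the host to a single integer port.

```python
from urllib.parse import urlsplit

# The URI from the report.
URI = "dcp.tcp.pft://192.168.0.1:1002:3002?fec=1&crc=0"

def describe(uri):
    """Return (scheme, authority, port_ok) for a URI string.

    port_ok is False when the text after the host cannot be read
    as a single integer port (urlsplit raises ValueError then).
    """
    parts = urlsplit(uri)
    try:
        parts.port  # evaluated lazily; raises ValueError on "1002:3002"
    except ValueError:
        port_ok = False
    else:
        port_ok = True
    return parts.scheme, parts.netloc, port_ok

if __name__ == "__main__":
    print(describe(URI))
    print(describe("dcp.tcp.pft://192.168.0.1:1002?fec=1&crc=0"))
```

With a single port the same authority splits cleanly, which is the form RFC 3986 expects for an IPv4 host.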
The value space for anyURI [1] is defined by RFC 2396 (and RFC 2732). dcp.tcp.pft://192.168.0.1:1002:3002?fec=1&crc=0 is allowed by the grammar since "192.168.0.1:1002:3002" matches reg_name. Registry-based Naming Authority (reg_name) has been supported since Xerces 2.6.0.
    authority  = server | reg_name
    reg_name   = 1*( unreserved | escaped | "$" | "," | ";" | ":" | "@" | "&" | "=" | "+" )
    unreserved = alphanum | mark
    mark       = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

If RFC 3986 prohibits this URI then it seems the new RFC is not backwards compatible with RFC 2396.

[1] http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#anyURI

RFC 3986 does seem to prohibit this URI. In 3986 we have:
    reg-name   = *( unreserved / pct-encoded / sub-delims )
    unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

In 3986 the colon is a general delimiter, not a sub-delimiter. I'm not sure what should be done here. Schemas Part 2 normatively references 2396, not 3986; so I suppose this should be allowed. On the other hand, I can't help but think that this is really a bug in the definition of reg-names in 2396.

The colon isn't the only issue. The @ sign is also prohibited in reg-names in 3986 and allowed in 2396. I wonder what the working group was thinking? I suspect they were trying to make it easier to distinguish reg-names from host-based authorities, and allow user info and port to be specified for registry-based authorities.
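The difference between the two grammars quoted above can be checked mechanically. The following is a rough sketch, not Xerces code: the regular expressions are hand-transcribed from the productions above, the function names are invented for this example, and the 3986 authority pattern deliberately leaves out bracketed IP-literals to stay short.

```python
import re

PCT = r"%[0-9A-Fa-f]{2}"  # "escaped" in RFC 2396, "pct-encoded" in RFC 3986

# RFC 2396 reg_name: one or more of unreserved | escaped | $ , ; : @ & = +
REG_NAME_2396 = re.compile(rf"(?:[A-Za-z0-9\-_.!~*'()$,;:@&=+]|{PCT})+")

# RFC 3986 reg-name: zero or more of unreserved / pct-encoded / sub-delims
# (neither ":" nor "@" appears in this character class)
REG_NAME_3986_SRC = rf"(?:[A-Za-z0-9\-._~!$&'()*+,;=]|{PCT})*"
REG_NAME_3986 = re.compile(REG_NAME_3986_SRC)

# RFC 3986 authority = [ userinfo "@" ] host [ ":" port ]
# Simplification: host is limited to reg-name characters (which also
# cover IPv4 literals); bracketed IP-literals are not handled here.
USERINFO_3986 = rf"(?:[A-Za-z0-9\-._~!$&'()*+,;=:]|{PCT})*"
AUTHORITY_3986 = re.compile(
    rf"(?:{USERINFO_3986}@)?{REG_NAME_3986_SRC}(?::[0-9]*)?"
)

def is_reg_name_2396(text):
    return REG_NAME_2396.fullmatch(text) is not None

def is_reg_name_3986(text):
    return REG_NAME_3986.fullmatch(text) is not None

def is_authority_3986(text):
    return AUTHORITY_3986.fullmatch(text) is not None

if __name__ == "__main__":
    authority = "192.168.0.1:1002:3002"
    print("2396 reg_name: ", is_reg_name_2396(authority))
    print("3986 reg-name: ", is_reg_name_3986(authority))
    print("3986 authority:", is_authority_3986(authority))
```

Under these patterns the reported authority is a 2396 reg_name but neither a 3986 reg-name nor a 3986 authority, since only one `":" port` suffix is allowed after the host.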
This particular issue is not listed in Appendix D.2 of 3986, Modifications, so I wonder if the working group noticed it?

Roy Fielding has confirmed that this was a deliberate decision, and is indeed an incompatibility between 2396 and 3986. According to him, "No URI schemes were defined using the reg_name syntax of 2396, and therefore it was removed." Probably nobody should be using such syntax now.
What to do now? This is a tough call, but I tend to fall back on the letter of the law (or the spec). The schemas spec references 2396, not 3986. Therefore Xerces should be changed to allow this syntax. This might change in schema 1.1 though, which will likely reference 3986, not 2396. However, the current working draft still references RFC 2396. I've asked the schema working group to consider this issue.

Just to clarify... The report was opened against Xerces 2.5.0. Xerces has allowed the reg_name syntax since Xerces 2.6.0, so as of today the schema validator will accept dcp.tcp.pft://192.168.0.1:1002:3002?fec=1&crc=0 as a valid value of type anyURI.
Interesting - you guys are doing a great job.
I raised the issue against 2.5 because that is what Stylus Studio ships with. Given the comment from Roy Fielding, it's a shame that the Digital Radio Mondiale team who wrote ETSI TS 102 821 didn't register their URI scheme. In fact this URI scheme has some practical problems, and its being essentially deprecated by RFC 3986 is an argument I can use to try to get it changed. Luckily Annex C is informative, not normative. I will try to get the ETSI spec changed and will definitely not incorporate the deprecated URI in the XML schema we are designing. We have a Digital Radio Mondiale meeting next week where I should be able to get this going.

I think Xerces should follow (and track) the W3C standard, so the current 2.6 behaviour is correct, the 2.5 behaviour is incorrect, and if W3C changes the schema spec to reference 3986 then Xerces should change with it. As far as this bug is concerned it doesn't seem like you need to implement anything, but it would be nice if the exact syntax accepted by the tool was somewhere in the user documentation (if it is then my mistake, but I couldn't find it by googling).

Julian

reg_name has been accepted as valid URI syntax since Xerces 2.6.0. This may change in the future if XML Schema 1.0 moves up to the RFC 3986 syntax, which excludes this production. Xerces CVS currently supports XML Schema 1.0, 2nd edition, which still references RFC 2396 for the anyURI type. The version of XML Schema 1.0 supported will be clearly marked in the documentation. The relevant RFCs for anyURI may be emphasized in a FAQ.
Michael Glavassevich made changes - 09/May/05 12:46 PM