Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- Patch
Description
As far as I can tell, it will not allow the following in the path/query of a URL:
"&", ";", "=" (query string)
"+", "%" (encoded characters)
"." (extensions)
There are several others.
In addition, some hosts are rejected as invalid because they lack a country code:
- localhost
- http://xn--rsum-bpad.example.org (from IRIs)
- 10.1.1.1
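To illustrate why these hosts should pass, here is a minimal sketch (not the attached patch) that checks a host against the RFC 3986 reg-name rule. The class and method names are hypothetical, and requiring a non-empty host is my own assumption; the RFC itself also permits an empty reg-name. IPv6 literals are not covered.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class HostSyntaxSketch {
    // RFC 3986: reg-name = *( unreserved / pct-encoded / sub-delims )
    // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    // sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    // Assumption: use "+" rather than "*" so an empty host is rejected.
    private static final Pattern REG_NAME = Pattern.compile(
            "(?:[A-Za-z0-9\\-._~!$&'()*+,;=]|%[0-9A-Fa-f]{2})+");

    // Returns true if the host is syntactically a reg-name.
    // A dotted-decimal IPv4 address also matches this rule.
    static boolean isValidHost(String host) {
        return REG_NAME.matcher(host).matches();
    }

    public static void main(String[] args) {
        String[] hosts = {"localhost", "xn--rsum-bpad.example.org", "10.1.1.1", "exa mple"};
        for (String h : hosts) {
            System.out.println(h + " -> " + isValidHost(h));
        }
    }
}
```

None of the three hosts above needs a country code to satisfy this rule; only the last sample (which contains a space) is rejected.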
My understanding of the URI specification (http://tools.ietf.org/html/rfc3986) is that the following delimiters are valid unencoded: :/@!$&'()*+,;=, and the following characters are also allowed: .-_~, as well as pct-encoded %xx
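As a rough sketch of that reading of the spec (again, not the patch itself), the path and query character sets from RFC 3986 can be written as regular expressions. The class and method names are hypothetical; the grammar rules in the comments are taken from the RFC.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class PathQueryCharsSketch {
    // RFC 3986: pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
    private static final String PCHAR =
            "(?:[A-Za-z0-9\\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})";

    // path-abempty = *( "/" segment ), segment = *pchar
    private static final Pattern PATH = Pattern.compile("(?:/" + PCHAR + "*)*");

    // query = *( pchar / "/" / "?" )
    private static final Pattern QUERY = Pattern.compile("(?:" + PCHAR + "|[/?])*");

    static boolean isValidPath(String path) {
        return PATH.matcher(path).matches();
    }

    static boolean isValidQuery(String query) {
        return QUERY.matcher(query).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidPath("/index.html"));     // "." in an extension
        System.out.println(isValidPath("/a+b/c%20d;v=1"));  // "+", pct-encoded, ";", "="
        System.out.println(isValidQuery("a=1&b=2;c=3"));    // "&", ";", "="
        System.out.println(isValidPath("/a b"));            // raw space is not allowed
    }
}
```

Note that a bare "%" is only accepted when followed by two hex digits, so "/100%" fails while "/100%25" passes.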
I've attached a patch to allow the extra characters, and to use those definitions for the userinfo and host as allowed in the spec. I've also broken out path, query and fragment explicitly.
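For reference, one way to break a URI into its components before validating each one is the generic regular expression given in RFC 3986, Appendix B. The sketch below uses that expression as-is; it is not the expression from the patch, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class UriSplitSketch {
    // Regular expression from RFC 3986, Appendix B.
    // Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
    private static final Pattern URI_PARTS = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Returns {scheme, authority, path, query, fragment}; absent parts are null.
    static String[] split(String uri) {
        Matcher m = URI_PARTS.matcher(uri);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not splittable: " + uri);
        }
        return new String[] {m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)};
    }

    public static void main(String[] args) {
        String[] parts = split("http://user@localhost:8080/a/b.html?x=1&y=2#top");
        String[] names = {"scheme", "authority", "path", "query", "fragment"};
        for (int i = 0; i < parts.length; i++) {
            System.out.println(names[i] + " = " + parts[i]);
        }
    }
}
```

This expression only separates the components; it does not reject invalid characters, so each part would still need its own check.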
There are still several other valid URIs that this won't allow (e.g. file:///..., IPv6 addresses), and there's a chance that the server-side validation (using java.net.URL) will differ from the client-side validation, so it may be good to also allow URL validation to be deferred to the server as an option.
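To make the server-side half concrete, here is a minimal sketch of a check based on java.net.URL. The class and method names are hypothetical, and I am assuming the server simply tries to construct a URL and treats MalformedURLException as a failure; the actual server-side code may do more than this.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, for illustration only.
final class ServerSideUrlCheckSketch {
    // Assumption: "valid" means java.net.URL can parse the string.
    // Note: the URL(String) constructor is deprecated in recent JDKs
    // (since Java 20) but is still available.
    static boolean parsesAsUrl(String candidate) {
        try {
            new URL(candidate);
            return true;
        } catch (MalformedURLException e) {
            // Thrown for a missing or unknown protocol, among other things.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsesAsUrl("http://localhost/index.html?a=1&b=2"));
        System.out.println(parsesAsUrl("http://10.1.1.1/"));
        System.out.println(parsesAsUrl("no-scheme/path"));
    }
}
```

A check like this mostly depends on the protocol being known to the JVM, which is a different rule from a character-level regular expression, so the two sides can disagree on the same input.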