The EmailValidator class, specifically the "protected boolean isValidSymbolicDomain(String domain)" method, makes an assumption about the Perl regex rules: that an email address contains no more than 10 domain/subdomain segments. For example, "foo@bar.2.3.4.5.6.7.8.9.com" (10 segments) is valid according to the EmailValidator, whereas "foo@bar.1.2.3.4.5.6.7.8.9.com" (11 segments) causes isValidSymbolicDomain(String) to throw an ArrayIndexOutOfBoundsException, because the "domainSegment" local variable is hard-coded to have a length of 10.
Whether or not this is due to a limitation in Perl w.r.t. the maximum number of allowed groupings, I do not know, but the RFC for email addresses does not appear to specify a maximum number of segments. Additionally, although I couldn't find it in the RFC, Wikipedia says that the maximum length of the domain name is 255 characters, though I am very hesitant to cite/use Wikipedia as an official source.
Granted, I've never seen a domain name w/ more than 5 subdomain names, let alone 10, but it seems like it should be supported regardless.
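The failure described above can be reproduced in isolation. The following is a minimal sketch, not the actual EmailValidator source: the class name, method name, and label pattern are hypothetical, and it uses java.util.regex rather than the library the validator relies on. It only shows how filling a fixed 10-slot array from a match loop overflows on the eleventh segment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the failing logic; NOT the real EmailValidator code.
class FixedSegmentDemo {
    // Assumed pattern: one run of letters, digits, or hyphens per domain segment.
    private static final Pattern LABEL = Pattern.compile("[A-Za-z0-9-]+");

    // Copies each segment into a fixed 10-slot array, as the report describes.
    static String[] splitIntoFixedArray(String domain) {
        String[] domainSegment = new String[10]; // hard-coded capacity
        Matcher m = LABEL.matcher(domain);
        int i = 0;
        while (m.find()) {
            // Throws ArrayIndexOutOfBoundsException once i reaches 10.
            domainSegment[i] = m.group();
            i++;
        }
        return domainSegment;
    }
}
```

Calling splitIntoFixedArray("bar.2.3.4.5.6.7.8.9.com") fills all ten slots, while splitIntoFixedArray("bar.1.2.3.4.5.6.7.8.9.com") throws on the eleventh segment.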
I'd submit a patch, but I wanted to discuss possible courses of action and determine the "right" (or at least acceptable) one. Possible solutions are:
1. Check whether the loop counter i has reached the length of the "domainSegment" array (10) and, if so, perform some action that stops the iteration before the out-of-bounds write.
2. If the maximum number of groupings in a Perl regex is 10, maybe we shouldn't use a regex to determine the groupings.
3. If, per the RFC, the maximum number of domain name groupings is 10, then the code should check for this explicitly and reject the address instead of throwing.
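As a rough illustration of how the options above could avoid the hard-coded limit, here is a minimal sketch that collects segments into a growable list instead of a fixed array. Everything in it is an assumption for discussion: the class and method names are hypothetical, the label pattern is a simplification, and the 255-character cap is taken from the figure mentioned above rather than confirmed against the RFC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; names and rules are illustrative, not the library's API.
class GrowableSegmentSketch {
    // Assumed label rule: alphanumeric, hyphens allowed only in the middle.
    private static final Pattern LABEL =
            Pattern.compile("[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?");

    // Assumption: overall length cap, using the 255 figure quoted in this report.
    static final int MAX_DOMAIN_LENGTH = 255;

    // Returns the segments of the domain, or an empty list if any segment is invalid.
    static List<String> splitDomain(String domain) {
        List<String> segments = new ArrayList<>();
        if (domain == null || domain.isEmpty() || domain.length() > MAX_DOMAIN_LENGTH) {
            return segments;
        }
        // A limit of -1 keeps trailing empty strings, so "bar.com." is rejected too.
        for (String label : domain.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return new ArrayList<>();
            }
            segments.add(label); // the list grows as needed; no fixed capacity
        }
        return segments;
    }

    // Treats a domain as plausible when it has at least two valid segments.
    static boolean looksLikeSymbolicDomain(String domain) {
        return splitDomain(domain).size() >= 2;
    }
}
```

With this shape, the number of segments is bounded only by the overall length check, so an 11-segment domain is handled the same way as a 2-segment one.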
Please let me know if you 1) have an alternative solution and 2) want me to code a/the fix.
VALIDATOR-202. Currently the plan is to refactor the remaining validation routines into the new "routines" package, removing the dependency on Jakarta ORO (using the regular expression support built into JDK 1.4). As part of that I plan on creating a "domain validator" that both the Email and URL validators share. A number of issues like this one have come up that get fixed in one of them but not the other; having a common "domain validator" should resolve this.
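To make the shared "domain validator" idea concrete, here is a minimal sketch of one domain check reused by an email check and a URL check. It is not the planned "routines" package: all class and method names are hypothetical, the label rule is a simplification, and the email check deliberately ignores the local part.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical shared domain check; illustrative only.
class SharedDomainCheck {
    static boolean isValidDomain(String domain) {
        if (domain == null || domain.isEmpty()) {
            return false;
        }
        // The array is sized by the input, so there is no fixed segment cap.
        String[] labels = domain.split("\\.", -1);
        if (labels.length < 2) {
            return false;
        }
        for (String label : labels) {
            // Assumed label rule: alphanumeric, hyphens only in the middle.
            if (!label.matches("[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")) {
                return false;
            }
        }
        return true;
    }
}

// Hypothetical email check: validates only the part after the last '@'.
class EmailCheckSketch {
    static boolean isValid(String email) {
        if (email == null) {
            return false;
        }
        int at = email.lastIndexOf('@');
        if (at < 1) {
            return false; // no '@', or nothing before it
        }
        return SharedDomainCheck.isValidDomain(email.substring(at + 1));
    }
}

// Hypothetical URL check: extracts the host and delegates to the same domain check.
class UrlCheckSketch {
    static boolean isValid(String url) {
        if (url == null) {
            return false;
        }
        try {
            String host = new URI(url).getHost(); // null when no host can be parsed
            return host != null && SharedDomainCheck.isValidDomain(host);
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```

Because both checks delegate to SharedDomainCheck.isValidDomain, a fix to the domain rules applies to email addresses and URLs at the same time.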