After I wrote my comment of 24/Nov/09 07:59 PM, I looked at the Java API again because I began to wonder whether unescaping and then using the Java API could be made to work on its own. I did look for alternatives before I created my big regular expression.
The big problem is that Java doesn't really offer any API that distinguishes numeric IP addresses from symbolic addresses. Although InetAddress.getByName(String) must have some means of parsing IPv4 and IPv6 numeric literals, that functionality is not exposed to java.net.* users. InetAddress.getByName(String) will accept either a numeric address or a symbolic name and produce indistinguishable results, so that piece of the API gives us no means of telling the two apart. I was unable to find any other API that makes the distinction.
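To illustrate, here is a minimal sketch of what the public API does give us. The class and method names (LiteralParseDemo, familyOf) are hypothetical, made up for this comment; only InetAddress.getByName, Inet4Address and Inet6Address come from java.net. The result tells us the address family, but the declared return type is plain InetAddress and there is no accessor saying whether the input string was a literal or a name. I use only numeric literals here because passing a symbolic name would trigger a lookup.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

final class LiteralParseDemo {
    // Hypothetical helper: reports the address family of whatever
    // InetAddress.getByName returns. It cannot report whether the input
    // was a numeric literal or a symbolic name.
    static String familyOf(String host) {
        try {
            InetAddress address = InetAddress.getByName(host);
            if (address instanceof Inet4Address) {
                return "IPv4";
            }
            if (address instanceof Inet6Address) {
                return "IPv6";
            }
            return "other";
        } catch (UnknownHostException e) {
            // Thrown for names that cannot be resolved and for malformed literals.
            return "unresolved";
        }
    }

    public static void main(String[] args) {
        // Documentation-range literals (RFC 5737, RFC 3849); parsing them needs no lookup.
        System.out.println("192.0.2.1   -> " + familyOf("192.0.2.1"));
        System.out.println("2001:db8::1 -> " + familyOf("2001:db8::1"));
    }
}
```

If familyOf were given a symbolic name that resolves to one of these addresses, it would return the same answer, which is exactly the problem described above.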
The formats of numeric IPv4 and IPv6 literals are fixed in RFCs and are extremely unlikely to change in the foreseeable future, so hard-coding them does not create a future-proofing problem. The only exposure we have is a possible future IPv8, but ICANN is doing its best to make that unnecessary for a very long time.
Since Apache already owns this regular expression, we should consider using it.
I considered the simpler approach of treating any address that contains a colon character as a numeric IPv6 address, but colons are used as other punctuation too, e.g., as the separator between an IP address and a port number. That solution felt too brittle and accident-prone to me, and it doesn't solve the IPv8 problem either. There is a continuum of IPv6 solutions ranging from "look for a colon" to the correct regular expression you see here, and no principled way to decide where to stop.
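To make the brittleness concrete, here is a small sketch of the "look for a colon" heuristic. The class and method names (ColonHeuristicDemo, looksLikeIpv6ByColon) are hypothetical and the sample strings are my own; this is the approach I rejected, not the regular expression proposed here.

```java
final class ColonHeuristicDemo {
    // The naive heuristic: anything containing a colon is assumed to be
    // a numeric IPv6 literal.
    static boolean looksLikeIpv6ByColon(String host) {
        return host.indexOf(':') >= 0;
    }

    public static void main(String[] args) {
        String[] samples = {
            "2001:db8::1",     // genuine IPv6 literal
            "192.0.2.1:8080",  // IPv4 literal plus port, misclassified
            "example.org:80",  // symbolic name plus port, misclassified
            "192.0.2.1"        // plain IPv4 literal
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + looksLikeIpv6ByColon(sample));
        }
    }
}
```

Both host-and-port strings are accepted as "IPv6" by this check, which is the kind of accident I was worried about.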