I would prefer to remove the hardcoded core providers from the source code. Java has a standard mechanism (the so-called service provider framework) that can be used to discover all codecs shipped in the JAR files on the classpath. This makes it easy to add custom codecs: you just add the JAR file to the classpath and the codec is available.
If you like, I could write the lookup code. Unfortunately it is only "standardized" since Java 6 (java.util.ServiceLoader), but it has been available in a different public class since Java 1.2. It is mainly used by:
- XML, XSLT (all of javax.xml)
- image formats (png, gif,..)
In general it works very simply:
The JAR file contains a provider-configuration file under META-INF/services that lists all classes implementing a codec; the file is named after the fully qualified class name of the abstract base class. A simple example: if you put xercesImpl.jar on your classpath, it contains a file META-INF/services/javax.xml.parsers.DocumentBuilderFactory that names the implementation class. Based on this information, DocumentBuilderFactory returns a suitable implementation of a DOM parser. The same would work for Lucene: lucene-core.jar would contain a simple list of classes (all of them are returned to the provider!). If you then add the JAR file of contrib-misc, the AppendingOnlyCodec would automatically be available as well.
The implementation is quite simple: you ask the service provider API for the above key (in our case an oal.index.Codec-like one) and it returns an Iterator over the implementations. Those would get registered on startup of DefaultCodecProvider.
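
To make the idea concrete, here is a minimal sketch of what the lookup could look like with the Java 6 API (java.util.ServiceLoader, which reads provider-configuration files under META-INF/services). The names CodecLookupSketch, Codec, Registry and loadFromClasspath are placeholders I made up for illustration; they are not the real Lucene classes, and Registry only stands in for whatever DefaultCodecProvider would do on startup.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

public class CodecLookupSketch {

    /** Hypothetical stand-in for the abstract codec base class. */
    public abstract static class Codec {
        public abstract String name();
    }

    /** Hypothetical registry playing the role of DefaultCodecProvider. */
    public static class Registry {
        private final Map<String, Codec> codecs = new LinkedHashMap<>();

        /** Registers every codec handed in, keyed by its name. */
        public void registerAll(Iterable<? extends Codec> found) {
            for (Codec codec : found) {
                codecs.put(codec.name(), codec);
            }
        }

        /** Returns the codec registered under the name, or null. */
        public Codec lookup(String name) {
            return codecs.get(name);
        }

        public int size() {
            return codecs.size();
        }
    }

    /**
     * Asks the service provider API for all Codec implementations that the
     * JAR files on the classpath declare, and registers them.
     */
    public static Registry loadFromClasspath() {
        Registry registry = new Registry();
        // ServiceLoader is Iterable and instantiates providers lazily.
        registry.registerAll(ServiceLoader.load(Codec.class));
        return registry;
    }

    public static void main(String[] args) {
        Registry registry = loadFromClasspath();
        System.out.println("codecs found: " + registry.size());
    }
}
```

A JAR would announce its codecs by shipping a text file in META-INF/services, named after the binary name of the base class (here CodecLookupSketch$Codec), with one implementation class name per line. Each implementation needs a public no-argument constructor so that ServiceLoader can instantiate it.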