As Benson mentioned, a pretty typical deployment scenario is one where you want to extend Tika with a few custom Parser classes. Currently you'd either need to maintain a custom version of the full configuration file, or do some CompositeParser magic to inject your custom parsers at runtime. Neither option is ideal.
Another concern of mine is that the current configuration mechanism disconnects the list of supported media types from the parser implementation class. It would be better if that list were maintained in the same Java source file as the parser, rather than in the XML configuration.
Thinking further, there's some interest in making Tika easy to use in more dynamic environments like an OSGi container where new parser components may be added to or removed from the system at any time. A static configuration file does not work that well in such situations.
So my idea is to move the list of media types supported by a Parser class into a class annotation (or perhaps a getSupportedTypes() method, which would work better with composite parsers). The tika-config.xml file would then be replaced with a META-INF/services/org.apache.tika.parser.Parser file that simply lists all the Parser implementations within that jar file.
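
To make the idea a bit more concrete, here is a rough sketch of how the discovery could work using the standard java.util.ServiceLoader mechanism. This is only an illustration under my own assumptions: SketchParser, PlainTextParser, index() and discover() are hypothetical names invented for this example, and SketchParser is a simplified stand-in rather than the actual org.apache.tika.parser.Parser interface. The "first registered parser wins" rule is likewise just one possible way to resolve conflicts.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.Set;

public class ParserDiscoverySketch {

    /** Hypothetical stand-in for the parser interface; NOT Tika's actual API. */
    public interface SketchParser {
        /** Media types handled by this parser, declared next to the implementation. */
        Set<String> getSupportedTypes();
    }

    /** Hypothetical example parser that declares its own media types. */
    public static class PlainTextParser implements SketchParser {
        @Override
        public Set<String> getSupportedTypes() {
            return Set.of("text/plain");
        }
    }

    /** Builds a media type -> parser lookup table from any collection of parsers. */
    public static Map<String, SketchParser> index(Iterable<? extends SketchParser> parsers) {
        Map<String, SketchParser> byType = new LinkedHashMap<>();
        for (SketchParser parser : parsers) {
            for (String type : parser.getSupportedTypes()) {
                // Assumption: the first parser registered for a type wins.
                byType.putIfAbsent(type, parser);
            }
        }
        return byType;
    }

    /**
     * Discovers every implementation listed in a META-INF/services file named
     * after the interface's binary class name on the classpath. If no such
     * file is present, the resulting map is simply empty.
     */
    public static Map<String, SketchParser> discover() {
        return index(ServiceLoader.load(SketchParser.class));
    }

    public static void main(String[] args) {
        // Classpath-based discovery: depends on which jars are present.
        System.out.println("Discovered types: " + discover().keySet());

        // The same indexing also works for parsers supplied at runtime.
        Map<String, SketchParser> manual = index(List.of(new PlainTextParser()));
        System.out.println("Manually indexed types: " + manual.keySet());
    }
}
```

With something like this, adding a custom parser would only require dropping a jar with its own META-INF/services entry on the classpath, with no central configuration file to edit. Keeping index() separate from discover() is deliberate: a dynamic environment such as an OSGi container could feed it whatever set of parsers is currently registered instead of relying on a one-time classpath scan.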