I admit up front I'm biased toward the dependency management case. From my perspective it's a pain to have to dig into the dependencies and exclude all the ones I don't want.
In the end, I think the key question is "what's the common case?" Is it more common to need a lot of parsers, or just one or two? If it's the former, I think keeping a single jar makes a lot of sense. If it's one or two, then I think having separate jars makes things better, because end-users have a clear path: only care about AutoCAD? Take the DWGParser jar and you're done.
Alternatively, there are other Maven-level options that would improve on the current state:
1. Make all of the dependencies of tika-parsers 'optional', except for tika-core. This more closely matches the non-dependency-managed scenario, where the end user is responsible for making sure he or she has all the required dependencies for the parser in question.
2. Create pom-only modules for each parser that pre-document the dependency filter. In other words, for each parser 'foo', create a tika-parser-foo pom that depends on tika-parsers but excludes the dependencies that are not needed by that parser. This saves each end user from the work of figuring out the exclusion list by themselves.
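To make the two options concrete, here is a rough sketch of what each might look like. The org.example coordinates and all version numbers are placeholders I made up for illustration, not real Tika dependencies, and 'foo' is the same hypothetical parser name as above.

For option 1, each third-party dependency declared in the tika-parsers pom would gain an optional flag, so it is no longer pulled in transitively:

```xml
<!-- Inside the <dependencies> section of the tika-parsers pom. -->
<!-- org.example:some-format-library is a placeholder, not a real dependency. -->
<dependency>
  <groupId>org.example</groupId>
  <artifactId>some-format-library</artifactId>
  <version>1.0</version>
  <!-- Optional dependencies are not passed on to projects that depend on tika-parsers. -->
  <optional>true</optional>
</dependency>
```

For option 2, a pom-only module for the hypothetical 'foo' parser could look roughly like this:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.tika</groupId>
  <!-- Hypothetical module name, following the tika-parser-foo pattern above. -->
  <artifactId>tika-parser-foo</artifactId>
  <!-- Placeholder version. -->
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

  <dependencies>
    <dependency>
      <groupId>org.apache.tika</groupId>
      <artifactId>tika-parsers</artifactId>
      <!-- Placeholder version. -->
      <version>1.0-SNAPSHOT</version>
      <exclusions>
        <!-- One exclusion per dependency that the 'foo' parser does not need. -->
        <!-- Placeholder coordinates. -->
        <exclusion>
          <groupId>org.example</groupId>
          <artifactId>library-not-needed-by-foo</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
  </dependencies>
</project>
```

An end user would then declare a single dependency on tika-parser-foo with type pom and inherit the pre-filtered dependency set.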
Since I'm making the request, I'm happy to volunteer myself for some of the grunt-work for any of these solutions, if resources are needed to get them done.