Apologies if I put this in the wrong spot. I'm attaching a patch (against current trunk) that adds support for a 'catenateShingles' option to the WordDelimiterFilter.
We (National Library of Australia - NLA) are currently maintaining this as an internal modification to the Filter, but I believe it is generic enough to contribute upstream.
The patch includes unit tests. As noted in one of them, CATENATE_WORDS and CATENATE_SHINGLES should be considered mutually exclusive for sensible usage: enabling both can produce duplicate tokens (although the duplicates should have the same positions, etc.).
I'm happy to work on it more if anyone finds problems with it.