Here's a patch that addresses the MB vs. mb issue and handles the ignoreCase setting better (I think). I'd appreciate some feedback on it before committing.
The main change is that I now preserve case, regardless of the ignoreCase setting, using an inner class named CasePreservingSynonymMappings. Here's an example of how that looks at runtime:
Thus, the ignoreCase setting isn't applied on store; rather, it is applied when the managed synonym mappings data is "viewed". For instance, a GET request for the "MB" child with ignoreCase==true would yield a merged list, such as:
This brings me to my first question: should we return only one form when mappings overlap, as is the case with "Megabyte" and "megabyte"? Right now it returns both forms, but with ignoreCase==true maybe it should return only one of them. Again, this is only a view and both forms are stored, so no information is lost if you switch ignoreCase.
If ignoreCase is false and you request the mappings for "MB", then you just get:
It follows that if ignoreCase == false and the client asks for "Mb", then they get a 404.
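To make the store-vs-view distinction concrete, here is a minimal sketch in plain Java. It is not the code from the patch: the class name CasePreservingSynonymStore, its methods, and the sample data are all hypothetical, and returning null stands in for the 404. It also assumes that with ignoreCase==true any casing of the term (e.g. "Mb") resolves to the merged view.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical illustration only; NOT the patch's CasePreservingSynonymMappings. */
class CasePreservingSynonymStore {
    // Terms and synonyms are stored exactly as added, whatever ignoreCase is.
    private final Map<String, Set<String>> stored = new TreeMap<>();
    private boolean ignoreCase;

    CasePreservingSynonymStore(boolean ignoreCase) {
        this.ignoreCase = ignoreCase;
    }

    void add(String term, String... synonyms) {
        Set<String> set = stored.computeIfAbsent(term, k -> new TreeSet<>());
        for (String s : synonyms) {
            set.add(s);
        }
    }

    // Only the flag changes; the stored data is untouched, so nothing is rebuilt.
    void setIgnoreCase(boolean ignoreCase) {
        this.ignoreCase = ignoreCase;
    }

    /** Returns the view for a term, or null if there is none (a REST layer would answer 404). */
    List<String> view(String term) {
        if (!ignoreCase) {
            // Case-sensitive: exact match on the stored key only.
            Set<String> exact = stored.get(term);
            return exact == null ? null : new ArrayList<>(exact);
        }
        // Case-insensitive: merge every stored entry whose key matches ignoring case.
        String key = term.toLowerCase(Locale.ROOT);
        Set<String> merged = new TreeSet<>();
        boolean found = false;
        for (Map.Entry<String, Set<String>> e : stored.entrySet()) {
            if (e.getKey().toLowerCase(Locale.ROOT).equals(key)) {
                found = true;
                merged.addAll(e.getValue());
            }
        }
        return found ? new ArrayList<>(merged) : null;
    }
}
```

With made-up data where "MB" maps to "Megabyte" and "mb" maps to "megabyte", the ignoreCase==true view of "MB" contains both forms, while the ignoreCase==false view contains only "Megabyte" and a lookup of "Mb" finds nothing.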
The second question is about switching the ignoreCase setting, which the API allows. The previous code rebuilt the map on a switch, but that is no longer needed because we store the data as it was added and apply the ignoreCase setting only when the data is viewed. Am I overlooking something here?
Lastly, you'll notice I'm applying the ignoreCase setting in the ManagedSynonymParser, to match the behavior of the current SynonymMap parser/builder. I've compared the results of the text analysis performed by the existing SynonymFilterFactory and the ManagedSynonymFilterFactory, and they produce the same tokens.
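As a rough sketch of what applying ignoreCase at parse time could look like, the snippet below folds case-preserved mappings down to lowercase just before they would be handed to the synonym map builder. This is an assumption about the approach, not the ManagedSynonymParser code itself: the class ParseTimeCaseFolding and the method toAnalysisMappings are hypothetical, and it assumes that both the term and its synonyms are lowercased when ignoreCase is true.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; NOT the patch's ManagedSynonymParser. */
class ParseTimeCaseFolding {
    /**
     * Builds the mappings to feed into analysis from the case-preserved store.
     * With ignoreCase, terms and synonyms are lowercased, so entries that differ
     * only by case collapse into one; otherwise the data passes through as stored.
     */
    static Map<String, Set<String>> toAnalysisMappings(
            Map<String, Set<String>> stored, boolean ignoreCase) {
        Map<String, Set<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : stored.entrySet()) {
            String term = ignoreCase ? e.getKey().toLowerCase(Locale.ROOT) : e.getKey();
            Set<String> target = out.computeIfAbsent(term, k -> new LinkedHashSet<>());
            for (String syn : e.getValue()) {
                target.add(ignoreCase ? syn.toLowerCase(Locale.ROOT) : syn);
            }
        }
        return out;
    }
}
```

Because the folding happens on a copy, the stored mappings keep their original case and the same store can serve either setting.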