And I also can't see anyone really spending time to aggressively ensure that the example schema etc is all up to date
I think you are vastly underestimating how much work is spent reviewing the example schema.xml prior to releases. It would be trivial to search/replace luceneMatchVersion="X" with luceneMatchVersion="Y" anytime the "current" version of Version was updated in Lucene-Java
the hardcoded 2.4 behavior is the action at a distance, because if i do not specify Version in my configuration file, then i get this very old behavior.
I don't follow you at all – you have identified no action, or distance in your example.
When i say i'm worried about scary action at a distance, i'm talking about editing something A in a config file, and having it result in changed behavior (action) in things B, C and D that do not directly refer to A in any way (distance). Furthermore, these changes in behavior are silent (thus scary).
If I have <fieldType name="A"/> and much later in the config <field name="B" type="A"/> then editing A results in an action on B at a distance – but this should not surprise me at all because B explicitly references A.
Having a global <luceneMatchVersion/> tag that affects the behavior of a variety of different things when it's modified leads to situations where people might change that value, triggering changes in many components w/o a clear idea of what might have changed – so they don't even know what things they should focus on testing for correctness after making that change.
The existing <schema version="X"/> property also leads to action at a distance type situations – but that is a lot less scary to me because at least with it there is a uniform set of changes to all schema objects between any two versions, so it's easy to document what changes when you go from 1.1 to 1.2, or 1.2 to 1.3 ... but with luceneMatchVersion the potential changes are unique to every individual Class that cares about it.
If this is really your concern, then i have an alternative i propose.
- No default anywhere, not even in the code
- Version is mandatory if the thing requires it
This is something Uwe and i both discussed in previous comments...
...as i said: i'm fine with this idea in theory – as a long term plan – but there has to be a gradual migration process for people. ie: it can be required on certain objects in a future release, but for at least the next release it needs to be possible to not specify the luceneMatchVersion on all of these objects, and when people use them w/o specifying, they can log big fat warnings on init that it is defaulting to 2.4, and that they should set the property explicitly if that's what they want.
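To make the migration idea concrete, here is a minimal sketch of the "default to 2.4 but warn loudly" behavior. MatchVersionResolver is a hypothetical helper, not an existing Solr or Lucene class, and versions are represented as plain strings purely for illustration; a real implementation would presumably map them onto Lucene's Version values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical helper (not part of Solr or Lucene): resolves the match version
// for one component and warns when it has to fall back to the legacy default.
final class MatchVersionResolver {
    private static final Logger LOG =
            Logger.getLogger(MatchVersionResolver.class.getName());

    // Legacy behavior assumed when nothing is configured, per the discussion.
    static final String LEGACY_DEFAULT = "2.4";

    // Recorded so callers can inspect which components fell back to the default.
    final List<String> warnings = new ArrayList<>();

    String resolve(String componentName, Map<String, String> initArgs) {
        String configured = initArgs.get("luceneMatchVersion");
        if (configured != null && !configured.trim().isEmpty()) {
            // Explicitly configured: use it as-is, no warning.
            return configured.trim();
        }
        String msg = componentName + ": no luceneMatchVersion specified, defaulting to "
                + LEGACY_DEFAULT + "; set it explicitly if that is what you want";
        warnings.add(msg);
        LOG.warning(msg);
        return LEGACY_DEFAULT;
    }
}
```

In a later release the fallback branch could throw instead of warning, which would make the property mandatory without surprising anyone who had been reading their logs.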
I still do not want it in schema.xml, as Version is a global Lucene thing!
Uwe: I think you are misunderstanding the reason for a distinction between solrconfig.xml and schema.xml in Solr. If (for the sake of argument) luceneMatchVersion really should be a "global Lucene thing" then that is precisely why it should be in schema.xml.
schema.xml is for configuration that is inherently part of the index, and must be consistent regardless of who/how/why that index is being used. solrconfig.xml is where settings are put that are specific to how a particular instance of an index is being used. If a setting is in solrconfig.xml, then it should be possible for that setting to be completely different on different solr instances that use the exact same schema.xml – even if they use cloned copies of the same index directory. (ie: master/slave distinctions in replication; peer slaves with distinct handler/cache settings to serve distinct use cases; etc...)
That's the reason why nothing that hangs off of IndexSchema is currently allowed to be SolrCoreAware, or get access to the SolrConfig object (and why the SolrResourceLoader abstraction was created) ... nothing about the SolrCore "instance" should be allowed to influence the resulting index, because that index may later be used on a different instance with a different config.
As i mentioned before: solrconfig.xml can depend on schema.xml, but schema.xml can not depend on solrconfig.xml
So if a global luceneMatchVersion can affect the behavior of an analyzer or FieldType in a way that is "persisted" as part of the index – and other classes (like QueryParser in Robert's example) need to make sure to use the same luceneMatchVersion to behave correctly with that index – then that setting needs to be in the schema.xml so it is consistent no matter how/where that index and schema.xml file are used.
Does that make sense?
I'd still like to clarify this whole issue of whether "Lucene-Java", as a project, has an expectation that client applications will always use a consistent value for Version when constructing objects that interact with an index, as Robert alluded to in a previous comment...
I don't think Version is intended so you can use X.Y on this part and Y.Z on this part
This was not my impression when Version was added – but i freely admit I wasn't paying that much attention.
In Uwe's comment he implied (but didn't actually state) that he concurred with Robert...
...Version is a global Lucene thing...
Iff that expectation really is true in Lucene-Java, and iff there really is an expectation that using multiple Version values within Solr is likely to cause people problems as objects interact, then it seems to me that it would be a very bad idea to offer any sort of out of the box support for per object overriding of luceneMatchVersion in our solrconfig.xml/schema.xml.
i know, i know ... this is a complete 180 from my previous claim that we should only have per object configuration – a claim that i still stand behind if Lucene-Java "supports" applications using multiple values of Version, but if that is not considered "supported" and if changes are actively being made in Lucene-Java that explicitly assume consistent Version usage, then I'm not convinced it would be a good idea to enable people to tweak things in that way. Anyone who understands the underlying Java code enough to appreciate the nuances of using A.B in one place and X.Y in another place can write their own Factory that looks at a luceneMatchVersion init param – the out of the box ones should stick with the global setting.
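For illustration only, the kind of custom Factory i mean might look roughly like this. PerObjectVersionFactory is a made-up name, it does not extend any real Solr base class, and the version is again just a string; the only point is that the per object override lives in user code, while the stock factories would never look at the init param.

```java
import java.util.Map;

// Hypothetical user-written factory (not a Solr class): honors a per-object
// "luceneMatchVersion" init param, otherwise sticks with the global setting.
final class PerObjectVersionFactory {
    private final String globalMatchVersion;
    private String effectiveMatchVersion;

    PerObjectVersionFactory(String globalMatchVersion) {
        this.globalMatchVersion = globalMatchVersion;
        this.effectiveMatchVersion = globalMatchVersion;
    }

    // Called once with the init params declared on this object in the config.
    void init(Map<String, String> args) {
        String override = args.get("luceneMatchVersion");
        effectiveMatchVersion = (override != null) ? override : globalMatchVersion;
    }

    String effectiveMatchVersion() {
        return effectiveMatchVersion;
    }
}
```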
BUT!!!!! ... those are Big "IFFs" ...
- Uwe: do you concur with Robert?
- Are there any threads/docs about the expectations of Version homogeneity/heterogeneity in Lucene-Java?