Thanks for the thorough feedback, Yonik. Some thoughts inline.
- units: if we do specify units, more standard abbreviations would probably be "km" and "mi" rather than "K" and "M"
- just standardizing on units might be the best option, as I think it would be simplest to use (and client conversion is trivial)
- Examples of confusion and complications that "units" can case:
- When filtering and sorting, what happens if there is a units mismatch?
- does "units" apply to the optional "radius" argument?
- what are the units for calculated and returned distances? People might assume that because they filtered using "km", that a distance sort or other function queries would be in "km" or "mi"
- if returned distances can be in any unit, it complicates client code that may not know the exact request, and hence the units
- boosting or adding in a function of distance to the relevancy score: changing the units would unexpectedly change how this worked.
Agreed on abbreviations. The units mostly come into play with picking the radius. Lucene spatial seems to assume miles in a lot of places, but I imagine this is actually kind of silly given that the majority of the world uses metric. I guess ideally, I want people to be able to save on conversion, but it's probably silly to try to engineer that in. I suppose we could have a whole discussion on whether to assume English or Metric. I'm still on the fence.
fl should probably just be "f" (fl stands for "field list")? Update: looking at the code again, it seems that the idea is to allow specifying lat,lon separately too?
we have point and latlon fields for this though... shouldn't it always just be a single field?
I was thinking that I might support multiple fields for those people who are already indexing lat/lon as two fields. That way they could get the benefits of this code w/o re-indexing. That being said, it is still unimplemented and I'm not sure how to implement it.
"radius" parameter is likely to confuse people... (i.e. I specified radius=1 because I wanted to search within a 1 mile radius), while in reality this represents the radius of the earth. Since 99.99% of people should not use this parameter, perhaps rename to "planet_radius" or "sphere_radius"?
sphere_radius it is.
Seems like it would be really nice to be able to do both distance filtering and bounding box filtering.
- Distance filtering would normally be implemented as a combination of a bounding box + distance calculations for points within that box.
- Distance filtering could get even more efficient in the future... we could also calculate an "inner" box within the bounding box.
Any points in the inner box would be guaranteed to be less than the distance, hence no need to calculate exact distance.
- It looks like the current implementation(s) does bounding box only?
LatLonType will implement bounding box + distance filtering. I imagine people will want to override the FieldTypes for their specifics on creating the Filter. Right now, PointType really is designed for a rectangular coordinate system. Most people will likely use LatLonType, which will be geared towards spherical (i.e. Earth). But, agreed, we will be able to optimize over time.
is there a reason why it's ResourceLoaderAware?
Artifact of earlier design. Removed.
Already changed and "higher" level, just haven't posted the patch. I actually renamed it too.
oops, I see that this is already done under "sfilt" currently - so it can just be removed from solrconfig.xml
before release, we'll prob want to cut down the spatial field types to just one recommended one that people can use w/o having to figure out the difference between all the types. OK to keep multiple in during development though... makes for easier ad hoc testing.
For now, I have a comment in the schema that says something to the effect that you likely will not need all of these. I think this gets to a bigger discussion we should have on dev@ about what the example should be. I'd like the example to be more self contained and more self explanatory. We could have a light version and a full featured one.
With an eye toward the end game, here's an example of what we might want to shoot for:
Agreed the syntax is nice. I don't think I will do that as part of this patch. Otherwise, I think all of those use cases are covered.