Solr / SOLR-6183

Add spatial BBoxField using BBoxSpatialStrategy

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10, 6.0
    • Component/s: spatial
    • Labels:
      None

      Description

      This introduces a new BBoxField configured like so:

          <fieldType name="bbox" class="solr.BBoxField"
                     numberType="tdouble" units="degrees"/>
      

      It's a field type built on the same backing as the other Solr 4 spatial field types (namely RPT), so it is used the same way, plus the options unique to this field. Ideally, numberType points to a double-based field type configured with docValues=true, but that is not required. Only TrieDouble is supported; there is no float support yet.

      This strategy only accepts indexed rectangles and querying by a rectangle. Indexing a rectangle requires WKT:
      ENVELOPE(-10, 20, 15, 10), which is minX, maxX, maxY, minY (yeah, that 'y' order is wacky, but it's not my spec). This year I hope to add indexing of ['lat,lon' TO 'lat,lon'], but it's not in there yet.
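
Because the ENVELOPE argument order (minX, maxX, maxY, minY) is easy to get wrong, a small helper can build the WKT string from a conventional min/max box. This is only a sketch: the function name `to_envelope` and the document fields `id` and `bbox` are hypothetical, and it assumes a field of the bbox type has been declared in the schema.

```python
def to_envelope(min_x, min_y, max_x, max_y):
    """Format a rectangle as WKT ENVELOPE(minX, maxX, maxY, minY)."""
    if min_y > max_y:
        raise ValueError("min_y must not exceed max_y")
    # No ordering check on x: this sketch leaves that validation to Solr.
    return f"ENVELOPE({min_x}, {max_x}, {max_y}, {min_y})"


# Hypothetical document for a schema with a 'bbox' field of the bbox type.
doc = {"id": "doc1", "bbox": to_envelope(min_x=-10, min_y=10, max_x=20, max_y=15)}
print(doc["bbox"])  # ENVELOPE(-10, 20, 15, 10)
```

The helper takes the y values in the usual min-then-max order and swaps them on output, which keeps the unusual ordering in one place.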

      To query using its special area overlap ranking, use the 'score' local-param with a new value, like so:
      q={!field f=bbox score=overlapRatio queryTargetProportion=0.25}Intersects(ENVELOPE(10,25,12,10))

      The queryTargetProportion defaults to 0.25 to be roughly what GeoPortal uses (although GeoPortal actually has a different formula). This default weights 1 part query factor to 3 parts target factor.
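
The "1 part query to 3 parts target" weighting can be illustrated with a rough sketch. It assumes the query factor is the intersection area divided by the query area, the target factor is the intersection area divided by the indexed (target) area, and all areas are planar (width * height). The names `Rect`, `area`, and `overlap_ratio` are hypothetical, zero-area handling is a guess, and the minSideLength option is ignored, so this is not necessarily what the Lucene strategy computes.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    # Same field order as WKT ENVELOPE: minX, maxX, maxY, minY.
    min_x: float
    max_x: float
    max_y: float
    min_y: float


def area(r: Rect) -> float:
    """Planar area; an empty (inverted) rectangle counts as zero."""
    return max(0.0, r.max_x - r.min_x) * max(0.0, r.max_y - r.min_y)


def overlap_ratio(query: Rect, target: Rect, query_target_proportion: float = 0.25) -> float:
    """Weighted blend of how much of the query and of the target is covered."""
    inter = Rect(
        max(query.min_x, target.min_x),
        min(query.max_x, target.max_x),
        min(query.max_y, target.max_y),
        max(query.min_y, target.min_y),
    )
    inter_area = area(inter)
    q_area, t_area = area(query), area(target)
    # Assumption: a zero-area side simply contributes nothing to the score.
    query_ratio = inter_area / q_area if q_area > 0 else 0.0
    target_ratio = inter_area / t_area if t_area > 0 else 0.0
    p = query_target_proportion
    return query_ratio * p + target_ratio * (1.0 - p)


query = Rect(10, 25, 12, 10)     # ENVELOPE(10,25,12,10)
indexed = Rect(-10, 20, 15, 10)  # ENVELOPE(-10, 20, 15, 10)
print(overlap_ratio(query, indexed))
```

With the two rectangles from this description, the intersection area is 20, the query area is 30, and the indexed area is 150, so under these assumptions the score is 0.25 * (20/30) + 0.75 * (20/150).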

      Add debug=results to see useful "explain" info.
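
For clients that assemble the request themselves, the query above can be built and URL-encoded as follows. This is a sketch only: the helper name `bbox_query_params` is hypothetical, and the field name `bbox` is an assumption about the schema.

```python
from urllib.parse import parse_qs, urlencode


def bbox_query_params(field, envelope, proportion=0.25, debug=False):
    """Build request parameters for an overlapRatio-scored rectangle query."""
    local_params = (
        f"{{!field f={field} score=overlapRatio "
        f"queryTargetProportion={proportion}}}"
    )
    params = {"q": f"{local_params}Intersects({envelope})"}
    if debug:
        # Ask Solr to include the "explain" info for the results.
        params["debug"] = "results"
    return params


params = bbox_query_params("bbox", "ENVELOPE(10,25,12,10)", debug=True)
query_string = urlencode(params)  # percent-encodes braces, spaces, etc.
print(query_string)
```

The encoded string can be appended to the select handler URL of whichever core or collection holds the field.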

      1. SOLR-6183__BBoxFieldType.patch
        14 kB
        David Smiley
      2. SOLR-6183__BBoxFieldType.patch
        15 kB
        David Smiley
      3. SOLR-6183__BBoxFieldType.patch
        13 kB
        David Smiley

          Activity

          David Smiley added a comment -

          New patch:

          • Syncs with the Lucene-spatial side in LUCENE-5714; in particular, docValues and whether to index or not are now supported. Still limited to doubles though.
          • New score=area & score=area2D options to return the area of the indexed shape. It's generally computed geodetically, but area2D uses simple & fast math (simply width * height) which is usually plenty good enough.
          • score=overlapRatio is the new name for the former areaOverlap (or whatever I called it) algorithm. And it has a new minSideLength local-param option.
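
To give a feel for how the simple width * height math compares with a geodetic area, here is a rough sketch. The spherical formula below (longitude span times the difference of the sines of the latitudes, converted to square degrees) is a textbook approximation chosen for illustration; it is an assumption and not necessarily what score=area computes. Both function names are hypothetical.

```python
import math


def area_2d(min_x, max_x, max_y, min_y):
    """Planar area in square degrees: simply width * height."""
    return (max_x - min_x) * (max_y - min_y)


def area_spherical_deg2(min_x, max_x, max_y, min_y):
    """Area of a lat/lon rectangle on a sphere, expressed in square degrees."""
    d_lon = math.radians(max_x - min_x)
    band = math.sin(math.radians(max_y)) - math.sin(math.radians(min_y))
    steradians = d_lon * band
    return steradians * (180.0 / math.pi) ** 2


# Arguments follow the ENVELOPE order: minX, maxX, maxY, minY.
print(area_2d(0, 1, 1, 0), area_spherical_deg2(0, 1, 1, 0))      # near the equator
print(area_2d(0, 1, 61, 60), area_spherical_deg2(0, 1, 61, 60))  # at high latitude
```

Near the equator the two values nearly agree, while at 60 degrees latitude the spherical area is roughly half the planar one, which suggests when the cheaper area2D math is good enough.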
          ASF subversion and git services added a comment -

          Commit 1608991 from David Smiley in branch 'dev/trunk'
          [ https://svn.apache.org/r1608991 ]

          SOLR-6183: AbstractSpatialFieldType: refactor getValueSourceFromSpatialArgs out of getQueryFromSpatialArgs

          ASF subversion and git services added a comment -

          Commit 1608992 from David Smiley in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1608992 ]

          SOLR-6183: AbstractSpatialFieldType: refactor getValueSourceFromSpatialArgs out of getQueryFromSpatialArgs

          David Smiley added a comment -

          Slightly updated patch that adds the field type to the default schema.xml so people are aware of it. It also has a couple of micro-refactorings.

          FYI the tests are meager because it's tested at the Lucene layer.

          ASF subversion and git services added a comment -

          Commit 1608998 from David Smiley in branch 'dev/trunk'
          [ https://svn.apache.org/r1608998 ]

          SOLR-6183: Spatial BBoxField using BBoxSpatialStrategy

          ASF subversion and git services added a comment -

          Commit 1609291 from David Smiley in branch 'dev/trunk'
          [ https://svn.apache.org/r1609291 ]

          SOLR-6183: bug, BBoxField didn't propagate docValues configuration.
          And numberType is now a required attribute.

          ASF subversion and git services added a comment -

          Commit 1609303 from David Smiley in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1609303 ]

          SOLR-6183: Spatial BBoxField using BBoxSpatialStrategy

          David Smiley added a comment -

          Done. BTW the schema.xml modification is as follows:

              <!-- Spatial rectangle (bounding box) field. It supports most spatial predicates, and has
               special relevancy modes: score=overlapRatio|area|area2D (local-param to the query).  DocValues is required for
               relevancy. -->
              <fieldType name="bbox" class="solr.BBoxField"
                  geo="true" units="degrees" numberType="_bbox_coord" />
              <fieldType name="_bbox_coord" class="solr.TrieDoubleField" precisionStep="8" docValues="true" stored="false"/>
          

          On trunk/5x I was tricked into thinking I passed the Lucene FieldType info from the Solr layer into BBoxStrategy correctly, but I wasn't with respect to DocValues. Tests were working because in 5x Solr uninverts automatically in the absence of DocValues. On 4x, DocValues is a requirement for any distance/overlap/area related relevancy. I'm not interested in adding complexity to switch between FieldCache vs DocValues APIs. On trunk DocValues is recommended but not necessary.

          TODO: Documentation in the ref guide.


            People

            • Assignee:
              David Smiley
              Reporter:
              David Smiley
            • Votes:
              0
              Watchers:
              3
