  1. Lucene - Core
  2. LUCENE-5779

Improve BBox AreaSimilarity algorithm to consider lines and points

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10, 6.0
    • Component/s: modules/spatial
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      GeoPortal's area overlap algorithm didn't consider lines and points; they end up driving the score to 0. I've thought about this for a bit and come up with an alternative scoring algorithm (already coded, tested, and documented).
      New Javadocs:

      /**
       * The algorithm is implemented as envelope on envelope overlays rather than
       * complex polygon on complex polygon overlays.
       * <p/>
       * <p/>
       * Spatial relevance scoring algorithm:
       * <DL>
       *   <DT>queryArea</DT> <DD>the area of the input query envelope</DD>
       *   <DT>targetArea</DT> <DD>the area of the target envelope (per Lucene document)</DD>
       *   <DT>intersectionArea</DT> <DD>the area of the intersection between the query and target envelopes</DD>
       *   <DT>queryTargetProportion</DT> <DD>A factor from 0 to 1 that divides the score between query and target.
       *   0.5 weights them evenly.</DD>
       *
       *   <DT>queryRatio</DT> <DD>intersectionArea / queryArea; (see note)</DD>
       *   <DT>targetRatio</DT> <DD>intersectionArea / targetArea; (see note)</DD>
       *   <DT>queryFactor</DT> <DD>queryRatio * queryTargetProportion;</DD>
       *   <DT>targetFactor</DT> <DD>targetRatio * (1 - queryTargetProportion);</DD>
       *   <DT>score</DT> <DD>queryFactor + targetFactor;</DD>
       * </DL>
       * Note: The actual computation of queryRatio and targetRatio is more complicated so that it considers
       * points and lines. Lines have the ratio of overlap, and points are either 1.0 or 0.0 depending on whether
       * they intersect or not.
       * <p />
       * Based on Geoportal's
       * <a href="http://geoportal.svn.sourceforge.net/svnroot/geoportal/Geoportal/trunk/src/com/esri/gpt/catalog/lucene/SpatialRankingValueSource.java">
       *   SpatialRankingValueSource</a> but modified. GeoPortal's algorithm will yield a score of 0
       * if either a line or point is compared, and it doesn't output a 0-1 normalized score (it multiplies the factors).
       *
       * @lucene.experimental
       */
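
      The scoring described in the Javadoc above can be sketched roughly as follows. This is a minimal illustration, not the committed Lucene code: the class AreaScoreSketch, the Envelope type, and the intersect, ratio, and score helpers are hypothetical names invented here, and the handling of lines and points is one plausible reading of the note (lines use the ratio of overlapping length, points score 1.0 when they intersect and 0.0 otherwise).

```java
public class AreaScoreSketch {

  /** Hypothetical axis-aligned envelope; width and/or height may be 0 (line or point). */
  static final class Envelope {
    final double minX, minY, maxX, maxY;

    Envelope(double minX, double minY, double maxX, double maxY) {
      this.minX = minX;
      this.minY = minY;
      this.maxX = maxX;
      this.maxY = maxY;
    }

    double width() { return maxX - minX; }
    double height() { return maxY - minY; }
    double area() { return width() * height(); }
  }

  /** Returns the intersection envelope, or null when the two envelopes are disjoint. */
  static Envelope intersect(Envelope a, Envelope b) {
    double minX = Math.max(a.minX, b.minX);
    double minY = Math.max(a.minY, b.minY);
    double maxX = Math.min(a.maxX, b.maxX);
    double maxY = Math.min(a.maxY, b.maxY);
    if (minX > maxX || minY > maxY) {
      return null;
    }
    return new Envelope(minX, minY, maxX, maxY);
  }

  /** Fraction of {@code shape} covered by the intersection (assumed reading of the note). */
  static double ratio(Envelope shape, Envelope inter) {
    if (inter == null) {
      return 0.0; // no intersection at all
    }
    if (shape.area() > 0) {
      return inter.area() / shape.area(); // regular box: ratio of areas
    }
    if (shape.width() > 0) {
      return inter.width() / shape.width(); // horizontal line: ratio of overlapping length
    }
    if (shape.height() > 0) {
      return inter.height() / shape.height(); // vertical line: ratio of overlapping length
    }
    return 1.0; // point that intersects
  }

  /** score = queryRatio * proportion + targetRatio * (1 - proportion). */
  static double score(Envelope query, Envelope target, double queryTargetProportion) {
    Envelope inter = intersect(query, target);
    double queryFactor = ratio(query, inter) * queryTargetProportion;
    double targetFactor = ratio(target, inter) * (1 - queryTargetProportion);
    return queryFactor + targetFactor;
  }

  public static void main(String[] args) {
    Envelope query = new Envelope(0, 0, 10, 10);
    Envelope box = new Envelope(5, 5, 15, 15);  // overlaps a quarter of the query
    Envelope point = new Envelope(2, 2, 2, 2);  // degenerate envelope inside the query
    System.out.println(score(query, box, 0.5));
    System.out.println(score(query, point, 0.5));
  }
}
```

      Because the two factors are added rather than multiplied, a point target that falls inside the query still contributes its targetFactor even though its queryRatio is 0, so the score stays within the 0-1 range without collapsing to 0.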
      

              People

              • Assignee:
                dsmiley David Smiley
                Reporter:
                dsmiley David Smiley