SOLR-1321

Support for efficient leading wildcards search

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.4
    • Component/s: Schema and Analysis
    • Labels:
      None

      Description

      This patch is an implementation of the "reversed tokens" strategy for efficient leading-wildcard queries.

      ReversedWildcardsTokenFilter reverses tokens and returns both the original token (optional) and the reversed token (with positionIncrement == 0). Reversed tokens are prepended with a marker character to avoid collisions between legitimate tokens and the reversed tokens. For example, a lowercased "DNA" reverses to "and", which collides with the regular term "and"; with the marker character it becomes "\u0001and".
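      The reversal-plus-marker behaviour described above can be sketched in plain Java, without any Lucene classes. This is only an illustration, not the code from the patch: the class ReversedTokenSketch, its Token holder, and the withOriginal flag are hypothetical names, and the emission order (original first, reversed second with a position increment of 0) is an assumption read off the description.

```java
import java.util.ArrayList;
import java.util.List;

public class ReversedTokenSketch {
    // Marker prepended to reversed tokens; \u0001 is the character named in the description.
    static final char MARKER = '\u0001';

    /** Hypothetical stand-in for a token: its text plus a position increment. */
    static final class Token {
        final String text;
        final int positionIncrement;

        Token(String text, int positionIncrement) {
            this.text = text;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Reverses the term and prepends the marker so it cannot collide with a real term. */
    static String reverseWithMarker(String term) {
        return MARKER + new StringBuilder(term).reverse().toString();
    }

    /**
     * Emits the tokens for one input term. With withOriginal set, the original
     * token is kept and the reversed token is stacked at the same position
     * (position increment 0); otherwise only the reversed token is emitted.
     */
    static List<Token> filter(String term, boolean withOriginal) {
        List<Token> out = new ArrayList<>();
        if (withOriginal) {
            out.add(new Token(term, 1));
            out.add(new Token(reverseWithMarker(term), 0));
        } else {
            out.add(new Token(reverseWithMarker(term), 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Print the marker as '|' so it is visible on a terminal.
        for (Token t : filter("dna", true)) {
            System.out.println(t.text.replace(MARKER, '|') + " posInc=" + t.positionIncrement);
        }
    }
}
```

      Because the marker is a control character that should not occur in ordinary text, the reversed form of "dna" stays distinct from the indexed term "and".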

      This TokenFilter can be added to the analyzer chain that is used during indexing.

      SolrQueryParser has been modified to detect the presence of such fields in the current schema and to treat them specially. First, SolrQueryParser examines the schema and collects a map of the fields in which these reversed tokens are indexed. If there is at least one such field, it also calls QueryParser.setAllowLeadingWildcard(true).

      When building a wildcard query (in getWildcardQuery), the term text may be reversed so that the wildcards end up further along the term. This happens only when the field uses the reversing filter during indexing (as detected above) AND the wildcard characters are at position 0 or 1 in the term. Otherwise the term text is processed as before, i.e. turned into a regular wildcard query.
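      The query-side decision can be sketched the same way. This is a minimal illustration under stated assumptions, not the patched SolrQueryParser: LeadingWildcardSketch and its methods are hypothetical, the schema lookup is reduced to a boolean parameter, "wildcard at position 0 or 1" is interpreted as the index of the first '*' or '?', and escaped wildcards are ignored.

```java
public class LeadingWildcardSketch {
    // Same marker character the index-time filter is described as prepending.
    static final char MARKER = '\u0001';

    /** Index of the first '*' or '?' in the term, or -1 if there is none. */
    static int firstWildcard(String term) {
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c == '*' || c == '?') {
                return i;
            }
        }
        return -1;
    }

    /** Reverse only if the field indexes reversed tokens and a wildcard sits at position 0 or 1. */
    static boolean shouldReverse(String term, boolean fieldHasReversedTokens) {
        int pos = firstWildcard(term);
        return fieldHasReversedTokens && (pos == 0 || pos == 1);
    }

    /**
     * Returns the term text to use for the wildcard query: the marked, reversed
     * pattern when reversal applies, otherwise the pattern unchanged.
     */
    static String rewrite(String term, boolean fieldHasReversedTokens) {
        if (!shouldReverse(term, fieldHasReversedTokens)) {
            return term;
        }
        return MARKER + new StringBuilder(term).reverse().toString();
    }

    public static void main(String[] args) {
        String[] patterns = {"*ology", "a*bc", "abc*"};
        for (String p : patterns) {
            // Print the marker as '|' so it is visible on a terminal.
            System.out.println(p + " -> " + rewrite(p, true).replace(MARKER, '|'));
        }
    }
}
```

      Reversing turns a leading wildcard into a trailing one, so the term dictionary can be scanned from a fixed prefix (the marker plus the reversed suffix) rather than enumerated in full.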

      Unit tests are provided to test the TokenFilter and the query parsing.

      1. wildcards.patch
        15 kB
        Andrzej Bialecki
      2. wildcards-2.patch
        19 kB
        Andrzej Bialecki
      3. wildcards-3.patch
        20 kB
        Andrzej Bialecki
      4. SOLR-1321.patch
        22 kB
        Grant Ingersoll
      5. SOLR-1321.patch
        23 kB
        Grant Ingersoll
      6. SOLR-1321.patch
        23 kB
        Robert Muir

        Activity

        Grant Ingersoll made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Grant Ingersoll made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Robert Muir made changes -
        Attachment SOLR-1321.patch [ 12419264 ]
        Grant Ingersoll made changes -
        Attachment SOLR-1321.patch [ 12419230 ]
        Grant Ingersoll made changes -
        Attachment SOLR-1321.patch [ 12419178 ]
        Andrzej Bialecki made changes -
        Attachment wildcards-3.patch [ 12419095 ]
        Grant Ingersoll made changes -
        Assignee Grant Ingersoll [ gsingers ]
        Andrzej Bialecki made changes -
        Attachment wildcards-2.patch [ 12415416 ]
        Andrzej Bialecki made changes -
        Attachment wildcards-2.patch [ 12415404 ]
        Andrzej Bialecki made changes -
        Field Original Value New Value
        Attachment wildcards.patch [ 12415125 ]
        Andrzej Bialecki created issue -

          People

          • Assignee:
            Grant Ingersoll
            Reporter:
            Andrzej Bialecki
          • Votes:
            1
            Watchers:
            1
