  1. Lucene - Core
  2. LUCENE-3426

optimizer for n-gram PhraseQuery

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 2.9.4, 3.0.3, 3.1, 3.2, 3.3, 3.4, 4.0-ALPHA
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: core/search
    • Labels: None
    • Lucene Fields: New

    Description

      If a 2-gram tokenizer is used and the query string has length 4, for example q="ABCD", QueryParser generates (when autoGeneratePhraseQueries is true) PhraseQuery("AB BC CD") with slop 0. But this can be optimized to PhraseQuery("AB CD") with appropriate positions (AB at position 0, CD at position 2), because those two terms already cover every character of the query.
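
      The selection rule can be sketched in plain Java without any Lucene dependency. This is only an illustration under stated assumptions, not the code from the attached patches: the class name NGramPhraseOptimizer and both helper methods are hypothetical, and the rule shown (keep every n-th gram plus the last one) assumes the index contains every overlapping n-gram, with each gram's position equal to its starting character offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene or of the attached patches.
public class NGramPhraseOptimizer {

    /** Splits text into overlapping n-grams; the gram at list index i starts at character i. */
    static List<String> ngrams(String text, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    /**
     * Returns the positions of the grams to keep: every n-th gram plus the
     * last one, so that every character of the query is still covered.
     */
    static List<Integer> keptPositions(int termCount, int n) {
        List<Integer> kept = new ArrayList<>();
        for (int pos = 0; pos < termCount; pos++) {
            if (pos % n == 0 || pos == termCount - 1) {
                kept.add(pos);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> grams = ngrams("ABCD", 2);            // AB, BC, CD
        for (int pos : keptPositions(grams.size(), 2)) {
            // Prints "AB @ position 0" and then "CD @ position 2".
            System.out.println(grams.get(pos) + " @ position " + pos);
        }
    }
}
```

      In Lucene 3.x, each kept gram and its original position could then be passed to PhraseQuery.add(Term term, int position), so the phrase still matches with slop 0. For a query whose gram count does not divide evenly, such as "ABCDE", the sketch keeps AB, CD and DE at positions 0, 2 and 3.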

      The idea came from the Japanese paper "N.M-gram: Implementation of Inverted Index Using N-gram with Hash Values" by Mikio Hirabayashi, et al. (The main theme of the paper is different from the idea that I'm using here, though.)

      Attachments

        1. LUCENE-3426.patch
          2 kB
          Koji Sekiguchi
        2. LUCENE-3426.patch
          4 kB
          Koji Sekiguchi
        3. PerfTest.java
          5 kB
          Koji Sekiguchi
        4. LUCENE-3426.patch
          7 kB
          Koji Sekiguchi
        5. LUCENE-3426.patch
          7 kB
          Koji Sekiguchi
        6. PerfTest.java
          5 kB
          Koji Sekiguchi
        7. LUCENE-3426.patch
          8 kB
          Koji Sekiguchi
        8. LUCENE-3426.patch
          8 kB
          Koji Sekiguchi

          People

            Assignee: Koji Sekiguchi
            Reporter: Koji Sekiguchi
            Votes: 0
            Watchers: 0
