  1. Lucene - Core
  2. LUCENE-2780

Optimize SpanTermQuery

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.9, 6.0
    • Component/s: core/search
    • Labels: None
    • Lucene Fields: New, Patch Available

    Description

      Looking at http://www.lucidimagination.com/search/document/c2c6f660ddde4f7f/dismaxqparserplugin_and_tokenization ,
      I saw a user building a DisjunctionMaxQuery / BooleanQuery out of SpanTermQuery clauses.

      I wonder if users know that doing this is much slower than just using TermQuery?
      I agree it makes little sense to use SpanTermQuery if you aren't going to use it inside a SpanNearQuery etc.,
      but on the other hand, I think it's a little non-intuitive that it wouldn't be just as fast in a case like this.

      I could see this complicating query parsing etc. for users that sometimes want to use positions.

      SpanTermQuery is the same as TermQuery, except that tf is computed as (number of spans * sloppyFreq(spanLength)).
      For this case, the number of spans equals tf, and spanLength for a single term is always 1.

      Maybe we should optimize SpanTermQuery to return a TermScorer that uses just this special tf computation.
      This would avoid reading positions for anyone that does this.
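
      To make the tf argument concrete, here is a rough standalone sketch (not the attached patch, and not Lucene code). It assumes a sloppyFreq of the form 1 / (distance + 1); the class and method names are made up for illustration. It compares the span-style frequency, which walks every position, with a shortcut that only needs the term frequency from the postings.

```java
public class SpanTermFreqSketch {

    /** Hypothetical stand-in for the similarity's sloppyFreq; assumes 1 / (distance + 1). */
    public static double sloppyFreq(int distance) {
        return 1.0 / (distance + 1);
    }

    /**
     * Span-style frequency: visit every occurrence (this is what forces
     * positions to be read) and add sloppyFreq(spanLength) for each one.
     */
    public static double spanStyleFreq(int[] positions) {
        double freq = 0.0;
        for (int start : positions) {
            int end = start + 1;            // a single-term span covers one position
            int spanLength = end - start;   // always 1 for a single term
            freq += sloppyFreq(spanLength);
        }
        return freq;
    }

    /**
     * Shortcut: the number of spans equals tf and every span has length 1,
     * so the same value follows from tf alone, without touching positions.
     */
    public static double termStyleFreq(int tf) {
        return tf * sloppyFreq(1);
    }

    public static void main(String[] args) {
        int[] positions = {3, 17, 42};      // made-up positions of one term in one document
        System.out.println("span-style freq: " + spanStyleFreq(positions));
        System.out.println("term-style freq: " + termStyleFreq(positions.length));
    }
}
```

      Both methods give the same value for any document, which is why a term-only scorer with an adjusted tf could in principle stand in for the span scorer in this special case.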

      Attachments

        1. LUCENE-2780.patch
          4 kB
          Robert Muir

          People

            Assignee: Unassigned
            Reporter: Robert Muir (rcmuir)