Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.9, 5.0
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Looking at http://www.lucidimagination.com/search/document/c2c6f660ddde4f7f/dismaxqparserplugin_and_tokenization ,
      I saw a user building DisjunctionMaxQuery / BooleanQuery with SpanTermQuerys.

      I wonder if users know that doing this is much slower than just using TermQuery?
      I agree it makes little sense to use SpanTermQuery if you aren't going to use it inside a SpanNear etc.,
      but on the other hand, I think it's a little non-intuitive that it wouldn't be just as fast in a case like this.

      I could see this complicating query parsing etc. for users who sometimes want to use positions.

      SpanTermQuery is the same as TermQuery, except tf is computed as (# of spans * sloppyFreq(spanLength)).
      For this case, # of spans = tf, and spanLength for a single term is always 1.

      Maybe we should optimize SpanTermQuery to return TermScorer, with just this special tf computation.
      This would avoid reading positions for anyone that does this.
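
The tf computation described above can be illustrated with a small standalone sketch. It does not use Lucene; the class and method names are hypothetical, and the slop factor is an assumption modeled on the 1 / (distance + 1) form of Lucene's default similarity. It only shows that walking positions and multiplying the term frequency by a constant give the same value for a single-term span.

```java
// Standalone sketch; does not depend on Lucene. All names here are hypothetical.
public class SpanTermFreqSketch {

    // Assumed slop factor, modeled on the 1 / (distance + 1) form of
    // Lucene's default similarity; other similarities may differ.
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Span-style frequency: visit every position and add one slop-weighted
    // contribution per span. This is the path that needs positions.
    static float spanFreq(int[] positions) {
        float freq = 0f;
        for (int start : positions) {
            int end = start + 1;             // a single-term span covers one position
            freq += sloppyFreq(end - start); // span length is always 1
        }
        return freq;
    }

    // Shortcut: the same value derived from the term frequency alone,
    // without looking at any position.
    static float shortcutFreq(int tf) {
        return tf * sloppyFreq(1);
    }

    public static void main(String[] args) {
        int[] positions = {3, 17, 42};
        System.out.println(spanFreq(positions));
        System.out.println(shortcutFreq(positions.length));
    }
}
```

Under these assumptions both calls in `main` print the same number, which is why a scorer that only reads frequencies could stand in for the span-based one here.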

        Activity

        Robert Muir added a comment -

        patch

        Michael McCandless added a comment -

        Looks good Robert! It's a sneaky trap. Maybe add a comment to createWeight explaining that this is only used when a "normal" (non-span) Query embeds a SpanTermQuery?

        Someday we need to merge Span* back into the "normal" queries.

        Steve Rowe added a comment -

        Bulk move 4.4 issues to 4.5 and 5.0

        Uwe Schindler added a comment -

        Move issue to Lucene 4.9.


          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir
          • Votes:
            0
            Watchers:
            0
