Solr / SOLR-12674

RollupStream should not use the HashQueryParser for 1 worker


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.5
    • Component/s: None
    • Labels: None

    Description

      Let's say I have a dataset of 100M documents.

      After applying a filter, only about 5k documents match, so the result set is tiny.

      If I run a search stream wrapped in a rollup stream, the query returns in the 200ms range.

      But if I mistakenly add the "partitionKeys" param to the search stream, the hash query parser is invoked. It runs over the entire document set, and the query time spikes to 7 seconds.
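To illustrate why a hash filter with a single worker does work without narrowing anything, here is a minimal sketch of hash partitioning. It is not Solr's implementation: the class and method names are hypothetical, and `String.hashCode()` stands in for whatever hash function the real hash query parser uses.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for hash partitioning. Solr's hash query parser uses
// its own hash function; String.hashCode() is an assumption for illustration.
class HashPartitionSketch {

    // True when the key falls into the partition owned by the given worker.
    static boolean belongsToWorker(String key, int workers, int worker) {
        return Math.floorMod(key.hashCode(), workers) == worker;
    }

    // Applies the partition test to every key, mirroring how a hash filter
    // has to evaluate each candidate document.
    static List<String> filter(List<String> keys, int workers, int worker) {
        return keys.stream()
                .filter(k -> belongsToWorker(k, workers, worker))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> keys = List.of("doc1", "doc2", "doc3", "doc4");
        // With workers=1 and worker=0 the test passes for every key,
        // so the filter does its work and removes nothing.
        System.out.println(filter(keys, 1, 0).equals(keys)); // prints "true"
    }
}
```

Since any hash modulo 1 is 0, the single worker always owns every document, which is why the clause is pure overhead in that case.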

      If the search stream isn't wrapped in a parallel stream, we should ignore the partitionKeys param.
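The proposed behavior could be sketched roughly as follows. This is not the actual patch: `HashFilterGuardSketch` and `hashFilter` are hypothetical names, and the only detail taken from this issue is the shape of the `{!hash workers=N worker=M}` filter clause seen in the logged request.

```java
import java.util.Optional;

// Hypothetical guard, not the real Solr code: decide whether a hash
// filter clause should be added to the underlying query at all.
class HashFilterGuardSketch {

    // Returns the fq clause only when partitionKeys is set AND more than one
    // worker participates; otherwise the clause would filter nothing.
    static Optional<String> hashFilter(String partitionKeys, int workers, int worker) {
        if (partitionKeys == null || partitionKeys.isEmpty() || workers <= 1) {
            return Optional.empty();
        }
        return Optional.of("{!hash workers=" + workers + " worker=" + worker + "}");
    }

    public static void main(String[] args) {
        // Single worker: no clause is produced.
        System.out.println(hashFilter("id", 1, 0).isPresent()); // prints "false"
        // Four workers: worker 2 gets its own partition filter.
        System.out.println(hashFilter("id", 4, 2).get()); // prints "{!hash workers=4 worker=2}"
    }
}
```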

      Sample Query:

      rollup(search(gettingstarted,q="*:*",fl="id",sort="id desc",partitionKeys="id"),over="id")

      Because of the partitionKeys param, the underlying query formed is:

      params={q=*:*&distrib=false&fl=id&sort=id+desc&partitionKeys=id&fq={!hash+workers%3D1+worker%3D0}&wt=json&version=2.2} hits=2 status=0 QTime=30

      This is a dummy dataset, so the hits and QTime shown here are not representative, but this query certainly doesn't need the hash query parser filter clause when workers=1.
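The filter clause in the logged request is URL-encoded, which makes it easy to overlook. A small sketch using only the JDK decodes it; the class name is hypothetical and the input string is copied from the log line above.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Decodes the URL-encoded fq value from the logged request so the
// hash query parser clause is readable.
class DecodeLoggedFilter {

    static String decode(String encoded) {
        // '+' becomes a space and %3D becomes '='.
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("{!hash+workers%3D1+worker%3D0}"));
        // prints: {!hash workers=1 worker=0}
    }
}
```

Decoded, the clause reads as a hash filter for worker 0 out of 1 worker, which is the case this issue asks to skip.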


          People

            Assignee: Unassigned
            Reporter: Varun Thacker
            Votes: 0
            Watchers: 2
