Solr / SOLR-2218

Performance of the start= and rows= parameters degrades sharply with large data sets


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.4.1
    • Fix Version/s: None
    • Component/s: Build
    • Labels: None

    Description

      This affects large data sets (> 10M rows).

      Setting start=<large number> and rows=<large number> is slow, and it gets slower the farther you get from start=0 with a complex query. Random sorting also makes it slower.
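      For illustration, the slow pattern is a client loop that pages by incrementing start; a minimal SolrJ sketch (the URL, collection name, and query are placeholders, not from this report):

        import org.apache.solr.client.solrj.SolrQuery;
        import org.apache.solr.client.solrj.impl.HttpSolrClient;
        import org.apache.solr.client.solrj.response.QueryResponse;
        import org.apache.solr.common.SolrDocumentList;

        public class DeepPagingLoop {
          public static void main(String[] args) throws Exception {
            // Placeholder URL and collection; any sufficiently large index shows the slowdown.
            try (HttpSolrClient client =
                     new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
              final int rows = 1000;
              for (int start = 0; ; start += rows) {
                SolrQuery q = new SolrQuery("*:*");
                q.setStart(start); // Solr must collect and skip all `start` preceding docs,
                q.setRows(rows);   // so each successive page costs more than the last.
                QueryResponse rsp = client.query(q);
                SolrDocumentList page = rsp.getResults();
                if (page.isEmpty()) break;
                // ... process page ...
              }
            }
          }
        }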

      We would like to make looping through large data sets faster. It would be nice if we could pass a pointer to the result set for looping, or support very large rows=<number> values.

      Something like:
      rows=1000
      start=0
      spointer=string_my_query_1

      Then, within some interval (say, 5 minutes), I could reference the same result set to fetch the next page:
      Something like:
      rows=1000
      start=1000
      spointer=string_my_query_1
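
      In client code, the proposed parameter could ride along as an extra request parameter; a sketch (spointer is the hypothetical parameter from this proposal, not an existing Solr API):

        SolrQuery q = new SolrQuery("*:*");
        q.setRows(1000);
        q.setStart(1000);
        // Hypothetical: reuse the server-side result set opened by the first request.
        q.set("spointer", "string_my_query_1");

      Solr 4.7 later added a comparable mechanism, cursorMark (see CursorMarkParams in SolrJ), which avoids the deep-paging cost in much this way.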

      What do you think? Since the data set is so large, the cache is not helping.

    Attachments

    Issue Links

    Activity

    People

      Assignee: Unassigned
      Reporter: Bill Bell (billnbell)
      Votes: 2
      Watchers: 3

    Dates

      Created:
      Updated:
      Resolved: