Solr / SOLR-4445

Join queries should execute their "from" queries on all shards

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.1
    • Fix Version/s: None
    • Component/s: query parsers, SolrCloud
    • Labels: None

    Description

      When a join query runs against a collection with multiple shards, the "from" side of the query is executed only on the shard that serves the request, not on all shards. The matching documents are then passed to the "to" side of the query. As a result, the overall result set is only a subset of what the same join query would return on a collection with a single shard.

      That is, assuming documents are spread evenly across shards, a four-shard collection will, on average, return 25% of the results a single-shard collection would.
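
      The 25% figure can be illustrated with a toy in-memory model. This is only a sketch, not Solr code: the field names (parent_id, id), the round-robin shard assignment, and the one-to-one relationship between "from" and "to" documents are all assumptions made for the example.

```python
NUM_SHARDS = 4

# Hypothetical data: 1,000 "from" documents, each pointing at one distinct "to" document.
from_docs = [{"id": f"child-{i}", "parent_id": f"parent-{i}"} for i in range(1000)]
to_docs = [{"id": f"parent-{i}"} for i in range(1000)]


def shard_of(position):
    """Assumed placement: documents are assigned to shards round-robin."""
    return position % NUM_SHARDS


def join(from_side, to_side):
    """Return the "to" documents whose id matches a parent_id on the "from" side."""
    keys = {doc["parent_id"] for doc in from_side}
    return [doc for doc in to_side if doc["id"] in keys]


# Single-shard collection: the "from" side sees every document.
single_shard = join(from_docs, to_docs)

# Behavior described in this issue: the "from" side only sees documents
# that live on the shard serving the request.
serving_shard = 0
local_from = [doc for i, doc in enumerate(from_docs) if shard_of(i) == serving_shard]
sharded = join(local_from, to_docs)

fraction = len(sharded) / len(single_shard)
print(f"{len(sharded)} of {len(single_shard)} results ({fraction:.0%})")
```

      With real data the fraction depends on how documents are routed and on how many "from" documents point at each "to" document, so 1/N is an average under these assumptions rather than a guarantee.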

      The code should execute the "from" side of the query on all available shards before passing those matching documents to the "to" side of the query.
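
      The requested behavior amounts to a two-phase join: gather the join keys from every shard first, then apply them to the "to" side. The sketch below models that with plain Python lists standing in for shards. The function names, the predicate-style "from" query, and the sample documents are hypothetical and do not correspond to any existing Solr API.

```python
def build_shards(docs, num_shards):
    """Distribute documents round-robin across in-memory 'shards' (assumed placement)."""
    shards = [[] for _ in range(num_shards)]
    for position, doc in enumerate(docs):
        shards[position % num_shards].append(doc)
    return shards


def distributed_join(shards, matches_from_query, from_field, to_field):
    """Toy two-phase join over all shards."""
    # Phase 1: run the "from" query on every shard and union the join keys.
    keys = set()
    for shard in shards:
        for doc in shard:
            if from_field in doc and matches_from_query(doc):
                keys.add(doc[from_field])

    # Phase 2: on every shard, keep the documents whose "to" field is in the key set.
    results = []
    for shard in shards:
        results.extend(doc for doc in shard if doc.get(to_field) in keys)
    return results


# Hypothetical sample data: six parents and six children, even-numbered children are red.
parents = [{"id": f"p{i}"} for i in range(6)]
children = [
    {"id": f"c{i}", "parent_id": f"p{i}", "color": "red" if i % 2 == 0 else "blue"}
    for i in range(6)
]
shards = build_shards(parents + children, num_shards=4)

joined = distributed_join(
    shards,
    matches_from_query=lambda doc: doc.get("color") == "red",
    from_field="parent_id",
    to_field="id",
)
joined_ids = sorted(doc["id"] for doc in joined)
print(joined_ids)
```

      In this layout no red child shares a shard with its parent, yet all three expected parents are returned because the key set is built from every shard before the "to" side is evaluated. A real implementation would also have to bound the size of the key set shipped between shards, which this sketch ignores.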

      Note: LUCENE-3759 proposes an upgrade to JoinUtil to support joining when the documents matched by the "from" side of the query exist on multiple shards. Solr does not use that class for joining (nor does anything else?), so this would have to be implemented separately.

          People

            Assignee: Unassigned
            Reporter: Colin Bartolome (cbartolo)
            Votes: 0
            Watchers: 2