Lucene - Core / LUCENE-3171

BlockJoinQuery/Collector

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4, 4.0-ALPHA
    • Component/s: modules/other
    • Labels: None
    • Lucene Fields: New

      Description

      I created a single-pass Query + Collector to implement nested docs.
      The approach is similar to LUCENE-2454 in that the app must index
      documents in "join order", as a block
      (IndexWriter.addDocuments/updateDocuments), with the parent doc at
      the end of the block; the difference is that this impl needs only
      one pass.
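
      To illustrate the "join order" layout, here is a minimal plain-Java
      sketch. It is not code from the patch and does not use Lucene; the
      class BlockLayoutSketch and its methods are hypothetical names. It
      assumes docIDs are assigned sequentially, that each block holds its
      children first and its parent last, and that parents are marked in a
      bit set, so the parent of any doc is the first set bit at or after
      its docID.

```java
import java.util.BitSet;

// Hypothetical stand-in for a block-indexed segment; not Lucene code.
class BlockLayoutSketch {
    // One bit per docID; a set bit marks a parent doc (the last doc of a block).
    private final BitSet parentBits = new BitSet();
    private int nextDocId = 0;

    /** Appends one block: childCount children first, then the parent. Returns the parent's docID. */
    int addBlock(int childCount) {
        nextDocId += childCount;     // children take the next childCount docIDs
        int parentId = nextDocId++;  // the parent is the last doc of the block
        parentBits.set(parentId);
        return parentId;
    }

    /** Returns the parent docID of the block containing docId, or -1 if there is none. */
    int parentOf(int docId) {
        return parentBits.nextSetBit(docId);
    }

    public static void main(String[] args) {
        BlockLayoutSketch index = new BlockLayoutSketch();
        index.addBlock(2); // docs 0 and 1 are children, doc 2 is their parent
        index.addBlock(3); // docs 3, 4 and 5 are children, doc 6 is their parent
        System.out.println(index.parentOf(4)); // prints 6
    }
}
```

      Because the parent always follows its children, a scan in docID
      order has seen every child of a block by the time it reaches the
      parent, which is what lets the join run in one pass.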

      Once you join at indexing time, you can take any query that matches
      child docs and join it up to the parent docID space using
      BlockJoinQuery. You then use BlockJoinCollector, which sorts parent
      docs by the provided Sort, to gather results grouped by parent. The
      collector finds any BlockJoinQuery instances (using
      Scorer.visitScorers) and retains the child docs corresponding to
      each collected parent doc.

      After searching is done, you retrieve the TopGroups for a given
      BlockJoinQuery from the collector.
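
      The join-and-group step can be sketched in plain Java as follows.
      This is an illustration under stated assumptions, not the
      BlockJoinQuery/BlockJoinCollector implementation: BlockJoinSketch
      and joinToParents are hypothetical names, child hits arrive as an
      array of docIDs, parents are marked in a bit set, and groups are
      ordered by parent docID rather than by a caller-provided Sort.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of joining child hits up to parent docIDs; not Lucene code.
class BlockJoinSketch {
    /**
     * Groups matching child docIDs under the parent docID of their block.
     * parentBits has one set bit per parent doc (the last doc of each block).
     */
    static Map<Integer, List<Integer>> joinToParents(int[] childHits, BitSet parentBits) {
        // TreeMap keeps groups ordered by parent docID (an assumption of this sketch).
        Map<Integer, List<Integer>> groups = new TreeMap<>();
        for (int child : childHits) {
            int parent = parentBits.nextSetBit(child);
            if (parent < 0 || parent == child) {
                continue; // skip docs outside any block, and hits that are parents themselves
            }
            groups.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        }
        return groups;
    }

    public static void main(String[] args) {
        BitSet parentBits = new BitSet();
        parentBits.set(2); // block 1: children 0-1, parent 2
        parentBits.set(6); // block 2: children 3-5, parent 6
        int[] childHits = {0, 1, 4}; // docIDs matched by some child query
        System.out.println(joinToParents(childHits, parentBits)); // prints {2=[0, 1], 6=[4]}
    }
}
```

      Each entry in the returned map plays the role of one group: the key
      is the collected parent doc and the value holds the child docs
      retained for it.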

      Like LUCENE-2454, this is less general than the arbitrary joins in
      Solr (SOLR-2272) or parent/child from ElasticSearch
      (https://github.com/elasticsearch/elasticsearch/issues/553), since you
      must do the join at indexing time as a doc block, but it should be
      able to handle nested joins as well as joins to multiple tables,
      though I don't yet have test cases for these.

      I put this in a new Join module (modules/join); I think as we
      refactor join impls we should put them here.

        Attachments

        1. LUCENE-3171.patch
          72 kB
          Michael McCandless
        2. LUCENE-3171.patch
          67 kB
          Michael McCandless
        3. LUCENE-3171.patch
          52 kB
          Michael McCandless

              People

              • Assignee: Unassigned
              • Reporter: Michael McCandless (mikemccand)