Solr / SOLR-3799

Federated search support - include documents from external collections in Solr search results.




      Following the discussion on dev@lucene.apache.org (http://mail-archives.apache.org/mod_mbox/lucene-dev/201205.mbox/%3CABA41FBE-72A8-467E-BF33-3D0CA1ED81DD@cominvent.com%3E , http://mail-archives.apache.org/mod_mbox/lucene-dev/201208.mbox/%3C1345800198.2303.68.camel@oo%3E), I would like to introduce the idea of Federated Search in Solr.

      It would be nice to have support for real Federated Search in Solr. This would be very helpful for people who want to include external search results in their Solr-based system.

      By Federated Search I mean searching not only across distributed Solr instances (the existing DistributedSearch in Solr) but also across other kinds of external search services.

      A typical federated search process includes:

      • a collection selection step
      • results merging
      • adapters for connecting to external collections
      • collection representations (used in collection selection and/or results merging)

      I'm thinking about creating a full solution with a basic example implementation of each module.

      The tasks that come to mind are:
      1. Federated request support in SearchHandler: the place where everything is tied together.
      2. CollectionSelectionComponent: should be independent, so that it can be used separately.
      3. Federated search support in QueryComponent: with no hard-coded algorithms, if possible.
      4. Results merging rules module: as a pluggable part of QueryComponent or as a separate MergingComponent.
      5. Adapter (connector) to an external collection: interface and example implementation.
      6. Collection representation: interface and default implementation, used to store information about indexes/collections.
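
      To make items 5 and 6 more concrete, here is a minimal sketch of what the two interfaces could look like. Everything in it is an assumption for illustration: the names CollectionAdapter, CollectionRepresentation, FederatedHit, InMemoryAdapter and VocabularyRepresentation are hypothetical, none of them exist in Solr, and the scoring is a toy. It uses plain Java only and does not touch any Solr API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of items 5 and 6; none of these types exist in Solr. */
final class AdapterSketch {

    /** One hit returned by a collection; field names are illustrative only. */
    static final class FederatedHit {
        final String collection;
        final String id;
        final double score;

        FederatedHit(String collection, String id, double score) {
            this.collection = collection;
            this.id = id;
            this.score = score;
        }
    }

    /** Item 5 (assumed shape): hides how an external collection is queried. */
    interface CollectionAdapter {
        String name();
        List<FederatedHit> search(String query, int rows);
    }

    /** Item 6 (assumed shape): summary data used for selection and merging. */
    interface CollectionRepresentation {
        String collectionName();
        /** Higher value means the collection is more likely to hold relevant documents. */
        double relevance(String query);
    }

    /** Example adapter backed by an in-memory list, standing in for a remote service. */
    static final class InMemoryAdapter implements CollectionAdapter {
        private final String name;
        private final List<String> docs;

        InMemoryAdapter(String name, List<String> docs) {
            this.name = name;
            this.docs = docs;
        }

        @Override public String name() { return name; }

        @Override public List<FederatedHit> search(String query, int rows) {
            String q = query.toLowerCase(Locale.ROOT);
            List<FederatedHit> hits = new ArrayList<>();
            for (int i = 0; i < docs.size() && hits.size() < rows; i++) {
                if (docs.get(i).toLowerCase(Locale.ROOT).contains(q)) {
                    // Toy score: documents earlier in the list rank higher.
                    hits.add(new FederatedHit(name, name + "-" + i, 1.0 / (i + 1)));
                }
            }
            return hits;
        }
    }

    /** Example representation: fraction of query terms found in a sampled vocabulary. */
    static final class VocabularyRepresentation implements CollectionRepresentation {
        private final String name;
        private final List<String> vocabulary; // expected to be lower-case terms

        VocabularyRepresentation(String name, List<String> vocabulary) {
            this.name = name;
            this.vocabulary = vocabulary;
        }

        @Override public String collectionName() { return name; }

        @Override public double relevance(String query) {
            String[] terms = query.toLowerCase(Locale.ROOT).trim().split("\\s+");
            int matched = 0;
            for (String term : terms) {
                if (vocabulary.contains(term)) matched++;
            }
            return (double) matched / terms.length;
        }
    }

    public static void main(String[] args) {
        CollectionAdapter adapter = new InMemoryAdapter("wiki",
                Arrays.asList("Apache Solr guide", "Lucene in Action", "Solr cookbook"));
        for (FederatedHit hit : adapter.search("solr", 10)) {
            System.out.println(hit.id + " " + hit.score);
        }
    }
}
```

      Keeping the adapter and the representation as separate interfaces would let the selection step work from locally stored summaries without contacting the external service at all.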

      The typical use case would look like this:

      • The user sends a search request.
      • Solr decides which indexes to delegate the request to (collection selection), for example by comparing the user's query with the collection representations.
      • Solr decides how many and which documents to get from each collection (merge rules), for example by using the results of the previous step.
      • Solr sends the user's query to the selected collections (Solr instances and/or external collections through dedicated adapters).
      • Solr merges and returns the results.
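
      The steps above can be sketched end to end in plain Java. This is only an illustration under stated assumptions, not a proposed implementation: the class FederatedFlowSketch and its Source and Hit types are hypothetical, selection keeps every collection whose relevance score is above a threshold, rows are split in proportion to those scores, and merging simply weights each hit's score by its collection's relevance. Real merge rules would be pluggable, as described in the task list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch of the federated use case; not Solr code. */
final class FederatedFlowSketch {

    /** A result entry; names are illustrative. */
    static final class Hit {
        final String collection;
        final String id;
        final double score;

        Hit(String collection, String id, double score) {
            this.collection = collection;
            this.id = id;
            this.score = score;
        }
    }

    /** A collection: a representation (relevance) plus an adapter (search). */
    static final class Source {
        final String name;
        final ToDoubleFunction<String> relevance;            // query -> selection score
        final BiFunction<String, Integer, List<Hit>> search; // (query, rows) -> hits

        Source(String name, ToDoubleFunction<String> relevance,
               BiFunction<String, Integer, List<Hit>> search) {
            this.name = name;
            this.relevance = relevance;
            this.search = search;
        }
    }

    /** Collection selection: keep sources scoring above the threshold. */
    static Map<Source, Double> select(String query, List<Source> sources, double threshold) {
        Map<Source, Double> selected = new LinkedHashMap<>();
        for (Source s : sources) {
            double r = s.relevance.applyAsDouble(query);
            if (r > threshold) selected.put(s, r);
        }
        return selected;
    }

    /** Merge rules (assumed): split rows in proportion to positive selection scores. */
    static Map<Source, Integer> allocate(Map<Source, Double> selected, int rows) {
        double total = 0;
        for (double v : selected.values()) total += v;
        Map<Source, Integer> quota = new LinkedHashMap<>();
        for (Map.Entry<Source, Double> e : selected.entrySet()) {
            quota.put(e.getKey(), (int) Math.ceil(rows * e.getValue() / total));
        }
        return quota;
    }

    /** Query the selected sources, weight their scores, merge and truncate. */
    static List<Hit> search(String query, List<Source> sources, int rows) {
        Map<Source, Double> selected = select(query, sources, 0.0);
        Map<Source, Integer> quota = allocate(selected, rows);
        List<Hit> merged = new ArrayList<>();
        for (Map.Entry<Source, Integer> e : quota.entrySet()) {
            Source s = e.getKey();
            double weight = selected.get(s);
            for (Hit h : s.search.apply(query, e.getValue())) {
                merged.add(new Hit(h.collection, h.id, h.score * weight));
            }
        }
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(merged.subList(0, Math.min(rows, merged.size())));
    }

    public static void main(String[] args) {
        Source local = new Source("local", q -> 1.0,
                (q, n) -> Arrays.asList(new Hit("local", "l1", 0.9), new Hit("local", "l2", 0.4)));
        Source external = new Source("external", q -> 0.5,
                (q, n) -> Arrays.asList(new Hit("external", "e1", 1.0)));
        for (Hit h : search("solr", Arrays.asList(local, external), 3)) {
            System.out.println(h.collection + "/" + h.id + " " + h.score);
        }
    }
}
```

      Sources that are not selected are never queried, which is the main saving when external services are slow or rate-limited.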

      Design requirements:

      • a lightweight implementation
      • designed as a Solr feature, not as something on top of Solr or as a Solr extension
      • easy to use and customize out of the box
      • open to extension or reimplementation by users

      Any suggestions/discussions welcome!




            plebanek Jacek Plebanek