Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
6.0
Description
Following discussion on dev@lucene.apache.org (http://mail-archives.apache.org/mod_mbox/lucene-dev/201205.mbox/%3CABA41FBE-72A8-467E-BF33-3D0CA1ED81DD@cominvent.com%3E , http://mail-archives.apache.org/mod_mbox/lucene-dev/201208.mbox/%3C1345800198.2303.68.camel@oo%3E), I would like to introduce the idea of Federated Search in Solr.
It would be nice to have support for real Federated Search in Solr. This would be very helpful for people who want to include external search results in their Solr-based system.
By Federated Search I mean searching not only across distributed Solr instances (the existing DistributedSearch in Solr) but also across other kinds of external search services.
A typical federated search process includes:
- a collection selection step
- results merging
- adapters for connecting to external collections
- collection representations (used in collection selection and/or results merging)
I'm thinking about creating a full solution with a basic example implementation of each module.
The things to do that come to mind are:
1. federated request support in SearchHandler: the place where everything is tied together.
2. CollectionSelectionComponent: which should be independent, so one can use it separately.
3. federated search support in QueryComponent: with no hard-coded algorithms, if possible.
4. Results merging rules module: as a pluggable part of QueryComponent or as a separate MergingComponent.
5. Adapter (connector) to external collection: interface and example implementation.
6. Collection representation: interface and default implementation, used to store information about indexes/collections.
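To make items 2 and 6 more concrete, here is a minimal sketch of what a collection representation and an independent selection step might look like. All names (CollectionRepresentation, TermSetRepresentation, CollectionSelector) are hypothetical and do not exist in Solr, and the term-overlap scoring is only a placeholder assumption; a real implementation would plug in a proper collection selection algorithm.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical interface: a summary of one index/collection used for selection.
interface CollectionRepresentation {
    String collectionName();

    // Estimated relevance of the collection to the query terms (higher is better).
    double score(List<String> queryTerms);
}

// Placeholder default implementation: the collection is described by a set of terms,
// and the score is the fraction of query terms found in that set.
final class TermSetRepresentation implements CollectionRepresentation {
    private final String name;
    private final Set<String> vocabulary;

    TermSetRepresentation(String name, Set<String> vocabulary) {
        this.name = name;
        this.vocabulary = vocabulary;
    }

    @Override
    public String collectionName() {
        return name;
    }

    @Override
    public double score(List<String> queryTerms) {
        if (queryTerms.isEmpty()) {
            return 0.0;
        }
        long hits = queryTerms.stream().filter(vocabulary::contains).count();
        return (double) hits / queryTerms.size();
    }
}

// Hypothetical stand-alone selection step: it depends only on the representations,
// so it could be used separately from the rest of the federated search pipeline.
final class CollectionSelector {
    static List<String> select(List<String> queryTerms,
                               List<CollectionRepresentation> representations,
                               int maxCollections) {
        return representations.stream()
                .filter(r -> r.score(queryTerms) > 0)
                .sorted(Comparator.comparingDouble(
                        (CollectionRepresentation r) -> r.score(queryTerms)).reversed())
                .limit(maxCollections)
                .map(CollectionRepresentation::collectionName)
                .collect(Collectors.toList());
    }
}
```

Keeping the scoring behind an interface means the selection algorithm is not hard-coded: users could swap in their own representation without touching the selector.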
The typical use case would look like this:
- user sends search request
- Solr decides which indexes to delegate the request to (collection selection): for example, by comparing the user's query with the collection representations.
- Solr decides how many and which documents to get from each collection (merge rules): for example, by using the results of the previous step.
- Solr sends user's query to collections (Solr instances and/or external collections through dedicated adapters)
- Solr merges and returns the results.
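The flow above could be sketched roughly as follows. This is only an illustration under stated assumptions: CollectionAdapter, InMemoryAdapter, Hit and FederatedSearcher are hypothetical names, not Solr classes; the collection weights are assumed to come from the selection step; and the merge rule (rows allocated in proportion to the weights, scores normalized by each collection's top score) is one simple choice among many.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical result record: one document returned by one collection.
final class Hit {
    final String collection;
    final String id;
    final double score;

    Hit(String collection, String id, double score) {
        this.collection = collection;
        this.id = id;
        this.score = score;
    }
}

// Hypothetical adapter interface: one implementation per Solr instance or external service.
interface CollectionAdapter {
    String name();

    // Returns at most `rows` hits, sorted by descending score.
    List<Hit> search(String query, int rows);
}

// Example adapter backed by a fixed, pre-sorted list (stands in for a real connector).
final class InMemoryAdapter implements CollectionAdapter {
    private final String name;
    private final List<Hit> hits;

    InMemoryAdapter(String name, List<Hit> hits) {
        this.name = name;
        this.hits = hits;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public List<Hit> search(String query, int rows) {
        return hits.subList(0, Math.min(rows, hits.size()));
    }
}

final class FederatedSearcher {
    // Assumed merge rule: rows proportional to collection weight, at least one per collection.
    static Map<String, Integer> allocateRows(Map<String, Double> weights, int totalRows) {
        double sum = weights.values().stream().mapToDouble(Double::doubleValue).sum();
        Map<String, Integer> rows = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            int n = sum == 0 ? 0 : (int) Math.round(totalRows * e.getValue() / sum);
            rows.put(e.getKey(), Math.max(1, n));
        }
        return rows;
    }

    static List<Hit> search(String query, List<CollectionAdapter> adapters,
                            Map<String, Double> weights, int totalRows) {
        Map<String, Integer> rows = allocateRows(weights, totalRows);
        List<Hit> merged = new ArrayList<>();
        for (CollectionAdapter adapter : adapters) {
            Integer n = rows.get(adapter.name());
            if (n == null) {
                continue; // collection was not selected
            }
            List<Hit> hits = adapter.search(query, n);
            double max = hits.stream().mapToDouble(h -> h.score).max().orElse(1.0);
            double divisor = max == 0 ? 1.0 : max;
            for (Hit h : hits) {
                // Normalize by the collection's top score, then scale by the collection weight,
                // so scores from different engines become roughly comparable.
                merged.add(new Hit(h.collection, h.id, weights.get(adapter.name()) * h.score / divisor));
            }
        }
        merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return merged.size() > totalRows ? new ArrayList<>(merged.subList(0, totalRows)) : merged;
    }
}
```

With a Solr-backed adapter weighted 0.75 and an external adapter weighted 0.25, a request for 4 rows would ask the first for 3 documents and the second for 1, then interleave them by normalized score.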
Design requirements:
- lightweight implementation
- designed as a Solr feature, not as something on top of Solr or as a Solr extension
- easy to use and customize out of the box
- allow for extension/reimplementation by users
Any suggestions/discussions welcome!