[SOLR-1614] Search in Hadoop - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Minor
Resolution: Won't Fix
Affects Version/s: 1.4
Fix Version/s: 3.2
Component/s: search
Labels: None

Description

What's the use case? Sometimes queries are expensive (such as
regex), or the indexes live in HDFS and need to be searched. By
leveraging Hadoop, these non-time-sensitive queries can be
executed without dynamically deploying the indexes to new Solr
servers.

We'll download the index shards out of HDFS (assuming they're
zipped), run the queries as a batch against each shard, then
merge the results using either a Solr query-results priority
queue or simply Hadoop's built-in merge sorting.
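
To make the merge step concrete, here is a minimal Python sketch of combining per-shard hit lists into one global top-N list. It is an illustration only, not Solr's or Hadoop's actual merge code: the (score, doc_id) tuple shape and the function name merge_shard_hits are assumptions, and each shard's list is assumed to arrive already sorted by descending score.

```python
import heapq
import itertools
from typing import Iterable, List, Tuple

# Hypothetical hit shape: (score, document id). Not a Solr type.
Hit = Tuple[float, str]


def merge_shard_hits(shard_hits: Iterable[List[Hit]], num_results: int) -> List[Hit]:
    """Merge per-shard hit lists into one global top-N list.

    Assumes every per-shard list is already sorted by descending score,
    which is what a per-shard top-N search would normally return.
    """
    # heapq.merge performs a lazy k-way merge backed by a heap;
    # reverse=True tells it the inputs are sorted largest-first.
    merged = heapq.merge(*shard_hits, key=lambda hit: hit[0], reverse=True)
    # Stop as soon as num_results hits have been taken.
    return list(itertools.islice(merged, num_results))


if __name__ == "__main__":
    shard_a = [(0.9, "a1"), (0.4, "a2")]
    shard_b = [(0.7, "b1"), (0.5, "b2")]
    print(merge_shard_hits([shard_a, shard_b], 3))
```

Because the merge is lazy, only as many hits as requested are pulled from the shard lists, which is the same property a bounded priority queue would give.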

The query file will be encoded in JSON, with each query carrying
(ID, query, numresults, fields). The shards file will simply
contain newline-delimited paths (HDFS or otherwise). The output
can be one Solr-encoded results file per query.
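
As a rough illustration of the two input files, the Python sketch below parses a query file and a shards file. The exact layout is not specified in this issue, so several things here are assumptions: one JSON object per line, the key names "id", "query", "numresults" and "fields", and the helper names BatchQuery, parse_queries and parse_shards.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class BatchQuery:
    """One entry of the query file (field names are assumed, not specified)."""
    id: str
    query: str
    numresults: int
    fields: List[str]


def parse_queries(text: str) -> List[BatchQuery]:
    """Parse a query file, assuming one JSON object per non-blank line."""
    queries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        queries.append(
            BatchQuery(
                id=str(record["id"]),
                query=record["query"],
                numresults=int(record["numresults"]),
                fields=list(record["fields"]),
            )
        )
    return queries


def parse_shards(text: str) -> List[str]:
    """Parse a shards file: newline-delimited paths, HDFS or otherwise."""
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    query_file = '{"id": "q1", "query": "title:/had.*p/", "numresults": 10, "fields": ["id", "title"]}\n'
    shards_file = "hdfs://namenode/indexes/shard1.zip\n/local/indexes/shard2.zip\n"
    print(parse_queries(query_file))
    print(parse_shards(shards_file))
```

The paths and the query string in the usage section are made-up sample values.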

I'm hoping to add an actual Hadoop unit test.

People

Assignee: Unassigned

Reporter: Jason Rutherglen

Votes: 0

Watchers: 3

Dates

Created: 30/Nov/09 20:26

Updated: 06/May/11 20:43

Resolved: 24/Jan/11 21:12