Lucene - Core
  LUCENE-6339

[suggest] Near real time Document Suggester

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0
    • Fix Version/s: 5.1, 6.0
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The idea is to index documents with one or more SuggestField(s) and be able to suggest documents with a SuggestField value that matches a given key.
      A SuggestField can be assigned a numeric weight to be used to score the suggestion at query time.

      Document suggestion can be done on an indexed SuggestField. The document suggester can filter out deleted documents in near real-time. The suggester can filter out documents based on a Filter (note: may change to a non-scoring query?) at query time.

      A custom postings format (CompletionPostingsFormat) is used to index SuggestField(s) and perform document suggestions.

      Usage

        // hook up custom postings format
        // indexAnalyzer for SuggestField
        Analyzer analyzer = ...
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        Codec codec = new Lucene50Codec() {
          PostingsFormat completionPostingsFormat = new Completion50PostingsFormat();
      
          @Override
          public PostingsFormat getPostingsFormatForField(String field) {
            if (isSuggestField(field)) {
              return completionPostingsFormat;
            }
            return super.getPostingsFormatForField(field);
          }
        };
        config.setCodec(codec);
        IndexWriter writer = new IndexWriter(dir, config);
        // index some documents with suggestions
        Document doc = new Document();
        doc.add(new SuggestField("suggest_title", "title1", 2));
        doc.add(new SuggestField("suggest_name", "name1", 3));
    writer.addDocument(doc);
    ...
    // open an NRT reader for the directory
    DirectoryReader reader = DirectoryReader.open(writer, false);
        // SuggestIndexSearcher is a thin wrapper over IndexSearcher
        // queryAnalyzer will be used to analyze the query string
        SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader, queryAnalyzer);
        
        // suggest 10 documents for "titl" on "suggest_title" field
        TopSuggestDocs suggest = indexSearcher.suggest("suggest_title", "titl", 10);
      

      Indexing

      The index analyzer is set through IndexWriterConfig.

      SuggestField(String name, String value, long weight) 
      

      Query

      The query analyzer is set through SuggestIndexSearcher.
      Hits are collected in descending order of the suggestion's weight.

      // full options for TopSuggestDocs (TopDocs)
      TopSuggestDocs suggest(String field, CharSequence key, int num, Filter filter)
      
      // full options for Collector
      // note: only collects, does not score
      void suggest(String field, CharSequence key, int num, Filter filter, TopSuggestDocsCollector collector) 
      

      Analyzer

      CompletionAnalyzer can be used to wrap another analyzer in order to tune suggest-field-only parameters.

      CompletionAnalyzer(Analyzer analyzer, boolean preserveSep, boolean preservePositionIncrements, int maxGraphExpansions)
      
      1. LUCENE-6339.patch
        119 kB
        Areek Zillur
      2. LUCENE-6339.patch
        117 kB
        Areek Zillur
      3. LUCENE-6339.patch
        116 kB
        Areek Zillur
      4. LUCENE-6339.patch
        116 kB
        Areek Zillur
      5. LUCENE-6339.patch
        110 kB
        Areek Zillur
      6. LUCENE-6339.patch
        108 kB
        Areek Zillur
      7. LUCENE-6339.patch
        109 kB
        Areek Zillur

        Activity

        Areek Zillur added a comment - edited

        Initial patch; needs more unit tests.
        The custom postings format was originally a fork of Completion090PostingsFormat. Document suggestion uses the same TopNSearcher as AnalyzingSuggester.

        Would be awesome to get some feedback on the patch!

        Michael McCandless added a comment -

        This looks really nice!

        I think AutomatonUtil is (nearly?) the same thing as
        TokenStreamToAutomaton? Can we somehow consolidate the two?

        When I try to "ant test" with the patch on current 5.x some things are
        angry:

            [mkdir] Created dir: /l/areek/lucene/build/suggest/classes/java
            [javac] Compiling 65 source files to /l/areek/lucene/build/suggest/classes/java
            [javac] /l/areek/lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/AnalyzingInfixSuggester.java:597: warning: [cast] redundant cast to TopFieldDocs
            [javac]       TopFieldDocs hits = (TopFieldDocs) c.topDocs();
            [javac]                           ^
            [javac] /l/areek/lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggester.java:208: error: local variable collector is accessed from within inner class; needs to be declared final
            [javac]               collector.collect(docID);
            [javac]               ^
            [javac] /l/areek/lucene/suggest/src/java/org/apache/lucene/search/suggest/document/CompletionFieldsProducer.java:164: error: CompletionFieldsProducer.CompletionsTermsReader is not abstract and does not override abstract method getChildResources() in Accountable
            [javac]   private class CompletionsTermsReader implements Accountable {
            [javac]           ^
            [javac] Note: Some input files use or override a deprecated API.
            [javac] Note: Recompile with -Xlint:deprecation for details.
            [javac] 2 errors
            [javac] 1 warning
        

        Not sure why we need an FSTBuilder inside the NRTSuggesterBuilder;
        can't the former be absorbed into the latter? Can NRTSuggesterBuilder
        be package private? Ie the public API here is the postings format and
        SuggestIndexSearcher / SuggestTopDocs? I think other things can be
        private, e.g. CompletionTokenStream.

        Can you use CodecUtil.writeIndexHeader when storing the FST? It also
        stores the segment ID and file extension in the header. And then
        CodecUtil.checkIndexHeader at read-time.

        CompletionTermsReader.lookup() should be sync'd? Else two threads
        could try to use the IndexInput (dictIn) at once?

        Maybe we should move the code in SuggestIndexSearcher.suggest into
        a new TopSuggestDocs.merge method?

        Do we really need the separate SegmentLookup interface? Seems like we
        can just invoke lookup method directly on CompletionTerms?

        Why do we allow -1 weight? And why do we restrict to int not long
        (other suggesters are long I think, though it does seem like
        overkill!).

        Simon Willnauer added a comment -

        Hey Areek, I agree with mike this looks awesome... lemme give you some comments

        • can we make CompletionAnalyzer immutable by any chance? I'd really like to not have setters if possible? For that I guess its constants need to be public as well?
        • is private boolean isReservedInputCharacter(char c) needed, since we check it again afterwards in the checkKey method? Maybe you just want to use a switch here?
        • In CompletionFieldsConsumer#close() I think we need to make sure IOUtils.close(dictOut) is also called if an exception is hit?
        • do we need the extra InputStreamDataInput in CompletionTermWriter#parse? I mean, it's a byte input stream, so we should be able to read all of the bytes?
        • SuggestPayload doesn't need a default ctor
        • can we use if (success == false) instead of if (!success) as a pattern in general?
        • use try / finally in CompletionFieldsProducer#close() to ensure all resources are closed, or pass both the dict and delegateFieldsProducer to IOUtils#close()?
        • you fetch the checksum for the dict file in the CompletionFieldsProducer ctor via CodecUtil.retrieveChecksum(dictIn) but ignore its return value, was this intended? I think you don't wanna do that here? Did you intend to check the entire file?
        • I wonder if we should just write one file for both the index and the FSTs? What's the benefit from having two?

        For loading the dict you put a comment in there saying // is there a better way of doing this?

        I think what you need to do is this:

        public synchronized SegmentLookup lookup() throws IOException {
          if (lookup == null) {
             try (IndexInput dictClone = dictIn.clone()) { // let multiple fields load concurrently
                 dictClone.seek(offset); // this is your field private clone 
                 lookup = NRTSuggester.load(dictClone);
             }
          }
          return lookup;
        }
        

        I'd appreciate a test that this works just fine, i.e. loading multiple FSTs concurrently.
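A minimal sketch of such a test, with a synchronized lazy loader standing in for CompletionTermsReader.lookup(); LazyLookup and its load counter are illustrative names, not the patch's actual classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for CompletionTermsReader: the expensive FST load
// must happen at most once, no matter how many threads race on lookup().
class LazyLookup {
  final AtomicInteger loads = new AtomicInteger();
  private Object lookup;

  synchronized Object lookup() {
    if (lookup == null) {
      loads.incrementAndGet(); // the expensive NRTSuggester.load would go here
      lookup = new Object();
    }
    return lookup;
  }

  public static void main(String[] args) throws Exception {
    LazyLookup reader = new LazyLookup();
    ExecutorService pool = Executors.newFixedThreadPool(8);
    List<Future<Object>> results = new ArrayList<>();
    for (int i = 0; i < 8; i++) {
      results.add(pool.submit(reader::lookup));
    }
    Object first = results.get(0).get();
    for (Future<Object> f : results) {
      if (f.get() != first) throw new AssertionError("different lookup instances");
    }
    pool.shutdown();
    if (reader.loads.get() != 1) throw new AssertionError("loaded more than once");
  }
}
```

The real test would additionally load several distinct fields' FSTs in parallel, but the single-load invariant under contention is the core of it.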

        I didn't get further than this due to the lack of time but I will come back to this either today or tomorrow. Good stuff Areek

        Areek Zillur added a comment -

        Thanks Michael McCandless and Simon Willnauer for the feedback!

        When I try to "ant test" with the patch on current 5.x some things are
        angry

        This is fixed.
        Hmm, interestingly enough those errors do not show up for me, using Java 8.

        Updated Patch:

        • removed private boolean isReservedInputCharacter(char c) and moved reserved input char check to toAutomaton(CharSequence key)
        • use CodecUtil.checkIndexHeader && CodecUtil.writeIndexHeader for all files in custom postings format
        • use if (success == false) instead of if (!success)
        • proper sync for loading FSTs concurrently
        • added TopSuggestDocs.merge method
        • make sure CompletionFieldsConsumer#close() and CompletionFieldsProducer#close() properly handle closing resources
        • removed SegmentLookup interface; use NRTSuggester directly
        • fixed weight check to not allow negative weights; allow long values
        • removed FSTBuilder and made NRTSuggesterBuilder & CompletionTokenStream package-private

        Still TODO:

        • consolidate AutomatonUtil and TokenStreamToAutomaton
        • make CompletionAnalyzer immutable
        • remove use of extra InputStreamDataInput in CompletionTermWriter#parse
        • test loading multiple FSTs concurrently
        • more unit tests
        Areek Zillur added a comment -

        you fetch the checksum for the dict file in the CompletionFieldsProducer ctor via CodecUtil.retrieveChecksum(dictIn) but ignore its return value, was this intended? I think you don't wanna do that here? Did you intend to check the entire file?
        I wonder if we should just write one file for both the index and the FSTs? What's the benefit from having two?

        This was intentional; I used the same convention as BlockTreeTermsReader#termsIn here. The thought was that doing the checksum check would be very costly, as in most cases the dict file would be large?
        If we write one file instead of two, then the checksum check would be more expensive for the index than it is now?

        Areek Zillur added a comment -

        Updated Patch:

        • nuke AutomatonUtil
        • make CompletionAnalyzer immutable
        • add tests
        • minor fixes
        Michael McCandless added a comment -

        New patch looks great, thanks Areek Zillur!

        In TopSuggestDocsCollector:

        • In collect, we seem to assume the suggest searcher will never call
          collect more than num times? How is that? If so, can you add that to
          the javadocs, and maybe add an assert upto < num in collect?
        • Can we just allocate scoreDocs up front instead of lazily?
        • In the javadocs, instead of "one hit can be..." maybe "one doc can
          be..."? Hit is a tricky word in this context since it could be a doc
          or a suggestion...

        In SuggestIndexSearcher, does it really ever make sense to take a
        generic Collector/LeafCollector? Can we instead just strongly type
        the params to all the methods to be TopSuggestDocsCollector?

        "In case a filter has to be applied, the queue size is doubled" is not
        quite correct? Maybe change the logic there so the int queueSize is
        first computed, and then if filter is enabled, it's doubled?

        Can we remove the separate WeightProcessor class and just make
        encode/decode static methods on NRTSuggester? We can add back
        abstractions later if users somehow need control over weight
        encoding...
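        Since the FST top-N search in other Lucene suggesters retrieves paths with minimal output, one plausible shape for those static methods is to store weights inverted; WeightCodec is a hypothetical name and this is not necessarily the patch's actual encoding:

```java
// Hypothetical sketch of static weight encode/decode for an FST-based
// suggester: weights are stored inverted so that a larger weight maps to a
// smaller stored output, and a min-output top-N search over the FST surfaces
// the highest-weighted suggestions first.
class WeightCodec {
  static long encode(long weight) {
    if (weight < 0) { // negative weights are rejected, as in the patch
      throw new IllegalArgumentException("weight must be >= 0");
    }
    return Long.MAX_VALUE - weight;
  }

  static long decode(long encoded) {
    return Long.MAX_VALUE - encoded;
  }
}
```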

        Can we add a test that tests the extreme case of nearly all docs
        filtered out and another test with nearly all docs deleted?

        Areek Zillur added a comment -

        Thanks Michael McCandless for the review!

        In TopSuggestDocsCollector:
        In collect, we seem to assume the suggest searcher will never call
        collect more than num times? How is that? If so, can you add that to
        the javadocs, and maybe add an assert upto < num in collect?
        Can we just allocate scoreDocs up front instead of lazily?
        In the javadocs, instead of "one hit can be..." maybe "one doc can
        be..."? Hit is a tricky word in this context since it could be a doc
        or a suggestion...

        I have rewritten TopSuggestDocsCollector to have a priority queue at the top level instead, somewhat similar to TopDocsCollector.
        Now completions across segments are collected in the same pq; this allows early termination for suggesters at the segment level
        (when a collected completion overflows the pq, we can disregard the rest of the completions for that segment,
        as completions are collected in order of their scores).
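        The collection scheme described above can be sketched with a plain bounded min-heap; TopNSketch and its long[] {score, docId} entries are illustrative stand-ins, not the actual collector:

```java
import java.util.PriorityQueue;

// Sketch of one top-level bounded priority queue shared across segments.
// Within a segment, completions arrive in descending score order, so the
// first completion that cannot enter a full queue lets us early-terminate
// that segment. Entries are {score, docId} pairs.
class TopNSketch {
  final int num;
  final PriorityQueue<long[]> pq =
      new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0])); // min-heap by score

  TopNSketch(int num) {
    this.num = num;
  }

  // Returns false when the current segment can stop producing completions.
  boolean collect(long score, long docId) {
    if (pq.size() < num) {
      pq.offer(new long[] {score, docId});
      return true;
    }
    if (score <= pq.peek()[0]) {
      return false; // scores only decrease within the segment: early-terminate
    }
    pq.poll();
    pq.offer(new long[] {score, docId});
    return true;
  }
}
```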

        In SuggestIndexSearcher, does it really ever make sense to take a
        generic Collector/LeafCollector? Can we instead just strongly type
        the params to all the methods to be TopSuggestDocsCollector?

        Thanks for the suggestion! The generic Collector/LeafCollector is removed.
        Current API:

        public void suggest(String field, CharSequence key, int num, Filter filter, TopSuggestDocsCollector collector) 
        

        "In case a filter has to be applied, the queue size is doubled" is not
        quite correct? Maybe change the logic there so the int queueSize is
        first computed, and then if filter is enabled, it's doubled?

        Now the queueSize is increased by half the # of live docs in the segment instead. If a filter is applied, the queue size should
        be increased w.r.t. the # of documents.
        If the applied filter filters out <= half of the top-scoring documents for a query prefix, then the search is admissible;
        if a filter is too restrictive, then the search is inadmissible. A workaround would be to multiply num by some factor;
        in this case early termination might help (if TopSuggestDocsCollector is initialized with the original num). Thoughts?

        Updated Patch:

        • SuggestIndexSearcher cleanup
        • TopSuggestDocsCollector re-write
        • remove WeightProcessor from NRTSuggester
        • added more tests (including boundary cases for deleted/filtered out documents)
        Areek Zillur added a comment -

        Updated Patch:

        • minor fixes
        Michael McCandless added a comment -

        Patch looks great!

        Can we pull out SuggestScoreDocPQ into its own .java source? Should its lessThan method tie-break by docID?

        I think the logic to compute maxQueueSize in getMaxTopNSearcherQueueSize could possibly overflow int? Maybe use long, and then cast back to int after the Math.min?
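        That suggestion can be sketched as follows; the cap value and the liveDocs/2 filter heuristic here paraphrase the discussion and are not the exact shipped formula:

```java
// Sketch of computing the TopNSearcher queue size in long arithmetic so the
// sum cannot overflow int, then capping and casting back. MAX_QUEUE_SIZE is
// an illustrative cap.
class QueueSizeSketch {
  static final int MAX_QUEUE_SIZE = 1000;

  static int maxQueueSize(int num, int liveDocs, boolean filterEnabled) {
    long size = num; // widen before adding: num + liveDocs/2 can exceed int
    if (filterEnabled) {
      size += liveDocs / 2L; // grow the queue when a filter may reject hits
    }
    return (int) Math.min(size, MAX_QUEUE_SIZE);
  }
}
```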

        Areek Zillur added a comment -

        Thanks Michael McCandless for the suggestions!

        Updated Patch:

        • separate out SuggestScoreDocPriorityQueue (break ties with docID)
        • use long to calculate maxQueueSize
        • minor changes
        Michael McCandless added a comment -

        I think the tie break should be a.doc > b.doc, for consistency with Lucene?

        I.e., on a score tie, the smaller doc ID should sort earlier than the bigger doc ID?

        Otherwise +1 to commit! Thanks Areek Zillur!
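        The requested tie-break can be sketched like this; SuggestScoreDoc here is a minimal stand-in for the patch's class:

```java
// Sketch of a lessThan with the tie-break Mike asks for: lower score is
// "less" (popped first from the min-queue); on a score tie the larger doc ID
// is "less", so the smaller doc ID survives in the queue and ends up ranked
// earlier, matching Lucene's usual convention.
class SuggestScoreDoc {
  final int doc;
  final float score;

  SuggestScoreDoc(int doc, float score) {
    this.doc = doc;
    this.score = score;
  }

  static boolean lessThan(SuggestScoreDoc a, SuggestScoreDoc b) {
    if (a.score == b.score) {
      return a.doc > b.doc; // tie: the bigger doc ID loses
    }
    return a.score < b.score;
  }
}
```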

        Uwe Schindler added a comment -

        I just reviewed the patch, too. I like the API, but have not yet looked into it closely like Mike - I just skimmed it.

        Just one question: What happens if 2 documents have the same SuggestField and same suggestion presented to user? This would now produce duplicates, right? I was just thinking about how to prevent that (coming from Elasticsearch world).

        Areek Zillur added a comment -

        Updated Patch:

        • SuggestScoreDocPQ prefers smaller doc id
        • documentation fixes

        I will commit this shortly. Thanks for all the feedback, Michael McCandless & Simon Willnauer!

        Uwe Schindler added a comment -

        +1

        Areek Zillur added a comment -

        Hi Uwe Schindler,
        Thanks for the review!
        If two documents do have the same suggestion for the same SuggestField, it will produce duplicates in terms of the suggestion, but because they come from two documents (different doc ids) they are not considered duplicates.
        Maybe we can add a boolean flag in the NRTSuggester to only collect unique suggestions, but then we would have to decide which suggestion to throw out, as they are now tied to doc ids?

        Uwe Schindler added a comment - edited

        If two documents do have the same suggestion for the same SuggestField, it will produce duplicates in terms of the suggestion, but because they are from two documents (different doc ids) they are not considered as duplicates.

        Yeah, that's what I mean by duplicate. The suggester only returns doc ids. For display to the user, one would read a stored field (the actual suggestion) like you do when presenting search results, and this produces the duplicate. I am not sure how to solve that; it was just an idea. If this is really an issue, one could filter the duplicates afterwards.
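        Filtering afterwards can be sketched as a single pass over the already score-sorted hits, keeping the first (highest-scored) occurrence of each suggestion string; plain strings stand in for the real results here:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of post-hoc duplicate filtering: the suggester's hits are already
// sorted by descending score, so keeping only the first occurrence of each
// suggestion string keeps the best-scored one. Plain strings stand in for
// the (docId, suggestion) pairs a real application would carry.
class DedupSketch {
  static List<String> dedup(List<String> sortedSuggestions) {
    Set<String> seen = new HashSet<>();
    List<String> unique = new ArrayList<>();
    for (String s : sortedSuggestions) {
      if (seen.add(s)) { // add() returns false for a repeat
        unique.add(s);
      }
    }
    return unique;
  }
}
```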

        ASF subversion and git services added a comment -

        Commit 1669698 from Areek Zillur in branch 'dev/trunk'
        [ https://svn.apache.org/r1669698 ]

        LUCENE-6339: Added Near-real time Document Suggester via custom postings format

        Uwe Schindler added a comment -

        Indeed, the suggestion does not need to come from a stored field of the result document, nice! But one could use that to add additional suggestion information, right, instead of using the payload?

        ASF subversion and git services added a comment -

        Commit 1669703 from Areek Zillur in branch 'dev/trunk'
        [ https://svn.apache.org/r1669703 ]

        LUCENE-6339: move changes entry from 6.0.0 to 5.1.0

        Areek Zillur added a comment -

        Yes, Uwe Schindler, that is the idea. The payload option has been removed entirely; instead of using payloads, one can now grab any associated values from the document for each suggestion.

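        A minimal sketch of what this payload-free approach looks like from the caller's side: once the suggester returns doc IDs, any extra display information is looked up per document. Here a plain map stands in for reading stored fields from an index reader; all names are hypothetical, not the Lucene API.

```java
import java.util.*;

// Hypothetical sketch: instead of encoding extra info in a suggestion
// payload, look it up per doc ID after the lookup. The 'storedFields' map
// stands in for reading stored fields from an IndexReader.
public class SuggestionMetadata {
    static final Map<Integer, Map<String, String>> storedFields = Map.of(
        1, Map.of("title", "Lucene in Action", "url", "/books/1"),
        3, Map.of("title", "Lucene Basics", "url", "/books/3"));

    // return all associated values for a suggested document, or an empty
    // map if the doc ID is unknown
    static Map<String, String> metadataFor(int docId) {
        return storedFields.getOrDefault(docId, Map.of());
    }

    public static void main(String[] args) {
        int[] suggestedDocIds = {1, 3};  // as returned by the suggester
        for (int docId : suggestedDocIds) {
            System.out.println(docId + " -> " + metadataFor(docId).get("url"));
        }
    }
}
```

        The design advantage over payloads is that the suggester's FST stays small (no per-entry blobs), and the associated values stay consistent with the rest of the document since they live in the document itself.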
        Areek Zillur added a comment -

        Committed to branch_5x with revision r1669715 (I missed prepending the JIRA issue number to the commit message).

        ASF subversion and git services added a comment -

        Commit 1670969 from Areek Zillur in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1670969 ]

        LUCENE-6339: fix test bug (ensure opening nrt reader with applyAllDeletes)

        ASF subversion and git services added a comment -

        Commit 1670972 from Areek Zillur in branch 'dev/trunk'
        [ https://svn.apache.org/r1670972 ]

        LUCENE-6339: fix test bug (ensure opening nrt reader with applyAllDeletes)

        ASF subversion and git services added a comment -

        Commit 1670978 from Areek Zillur in branch 'dev/branches/lucene_solr_5_1'
        [ https://svn.apache.org/r1670978 ]

        LUCENE-6339: fix test bug (ensure opening nrt reader with applyAllDeletes)

        ASF subversion and git services added a comment -

        Commit 1671187 from Areek Zillur in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1671187 ]

        LUCENE-6339: fix test (ensure the maximum requested size is bounded to 1000)

        ASF subversion and git services added a comment -

        Commit 1671189 from Areek Zillur in branch 'dev/branches/lucene_solr_5_1'
        [ https://svn.apache.org/r1671189 ]

        LUCENE-6339: fix test (ensure the maximum requested size is bounded to 1000)

        ASF subversion and git services added a comment -

        Commit 1671196 from Areek Zillur in branch 'dev/trunk'
        [ https://svn.apache.org/r1671196 ]

        LUCENE-6339: fix test (ensure the maximum requested size is bounded to 1000)

        ASF subversion and git services added a comment -

        Commit 1671914 from Areek Zillur in branch 'dev/trunk'
        [ https://svn.apache.org/r1671914 ]

        LUCENE-6339: fix test (take into account inadmissible filtered search for multiple segments)

        ASF subversion and git services added a comment -

        Commit 1671915 from Areek Zillur in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1671915 ]

        LUCENE-6339: fix test (take into account inadmissible filtered search for multiple segments)

        ASF subversion and git services added a comment -

        Commit 1671916 from Areek Zillur in branch 'dev/branches/lucene_solr_5_1'
        [ https://svn.apache.org/r1671916 ]

        LUCENE-6339: fix test (take into account inadmissible filtered search for multiple segments)

        ASF subversion and git services added a comment -

        Commit 1672458 from Steve Rowe in branch 'dev/trunk'
        [ https://svn.apache.org/r1672458 ]

        LUCENE-6339: Maven config: add resource dir src/resources/ to the POM.

        ASF subversion and git services added a comment -

        Commit 1672459 from Steve Rowe in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1672459 ]

        LUCENE-6339: Maven config: add resource dir src/resources/ to the POM. (merged trunk r1672458)

        ASF subversion and git services added a comment -

        Commit 1672461 from Steve Rowe in branch 'dev/branches/lucene_solr_5_1'
        [ https://svn.apache.org/r1672461 ]

        LUCENE-6339: Maven config: add resource dir src/resources/ to the POM. (merged trunk r1672458)

        Timothy Potter added a comment -

        Bulk close after 5.1 release


          People

          • Assignee: Areek Zillur
          • Reporter: Areek Zillur
          • Votes: 0
          • Watchers: 8