Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Won't Fix
None
None
Description
Implementation plan:
Use MoreLikeThis queries on a SolrYard instance in which each topic is indexed as the aggregated text of the abstracts of all DBpedia entities categorized under that SKOS topic.
Such an index can be built with the Pig scripts available at:
https://github.com/ogrisel/pignlproc/tree/master/examples/topic-corpus
or
https://github.com/ogrisel/dbpediakit
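The aggregation step described above can be illustrated with a minimal in-memory sketch. It is not taken from the linked scripts; the `Entity` class, the `aggregate` method, and the sample category names are hypothetical, and a real corpus would be built with a batch job rather than in memory.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TopicAggregationSketch {

    /** Hypothetical holder: one entity's abstract and the topics it is categorized under. */
    public static final class Entity {
        final String abstractText;
        final List<String> topics;

        public Entity(String abstractText, List<String> topics) {
            this.abstractText = abstractText;
            this.topics = topics;
        }
    }

    /** Concatenates, per topic, the abstracts of all entities categorized under it. */
    public static Map<String, String> aggregate(List<Entity> entities) {
        Map<String, StringBuilder> byTopic = new TreeMap<>();
        for (Entity entity : entities) {
            for (String topic : entity.topics) {
                StringBuilder text = byTopic.computeIfAbsent(topic, k -> new StringBuilder());
                if (text.length() > 0) {
                    text.append('\n'); // one abstract per line
                }
                text.append(entity.abstractText);
            }
        }
        Map<String, String> result = new TreeMap<>();
        byTopic.forEach((topic, text) -> result.put(topic, text.toString()));
        return result;
    }

    public static void main(String[] args) {
        List<Entity> entities = Arrays.asList(
            new Entity("Paris is the capital of France.",
                       Arrays.asList("Capitals_in_Europe")),
            new Entity("Berlin is the capital of Germany.",
                       Arrays.asList("Capitals_in_Europe", "Cities_in_Germany")));
        // Each map entry would become one document in the topic index.
        aggregate(entities).forEach((topic, text) ->
            System.out.println(topic + " => " + text.replace("\n", " | ")));
    }
}
```

Each resulting entry corresponds to one document in the topic index, with the topic as the key and the concatenated abstracts as the text field that MoreLikeThis compares against.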
To run MoreLikeThis queries through the SolrJ API, the following works:
#1 - Define the mlt handler in solrconfig.xml (it was not defined in the example solrconfig.xml I was using):
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
#2 - With SolrJ, access the mlt handler with something similar to the following:
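SolrQuery query = new SolrQuery();
// Route the request to the /mlt handler defined in step #1.
query.setQueryType("/" + MoreLikeThisParams.MLT);
// Leave the matched source document out of the results.
query.set(MoreLikeThisParams.MATCH_INCLUDE, false);
// Lowest thresholds: a term needs one occurrence in the source and in one indexed document.
query.set(MoreLikeThisParams.MIN_DOC_FREQ, 1);
query.set(MoreLikeThisParams.MIN_TERM_FREQ, 1);
// Fields whose terms are compared for similarity.
query.set(MoreLikeThisParams.SIMILARITY_FIELDS, "subject,body");
query.setQuery("Your query here or in my case the unique key field:value");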
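For reference, the SolrJ constants above map to plain request parameters (`mlt.fl`, `mlt.mintf`, `mlt.mindf`, `mlt.match.include`), so the same request can be expressed as a URL against the /mlt handler. The following is a minimal sketch using only the Java standard library (Java 10 or later); the class name, the method name, and the `id:example-topic` query are hypothetical, and the host and core prefix are left out because they depend on the deployment.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class MltQuerySketch {

    /** Builds the handler path and query string for a MoreLikeThis request. */
    public static String buildMltPath(String query, String similarityFields) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);                   // selects the source document
        params.put("mlt.fl", similarityFields);   // MoreLikeThisParams.SIMILARITY_FIELDS
        params.put("mlt.mintf", "1");             // MoreLikeThisParams.MIN_TERM_FREQ
        params.put("mlt.mindf", "1");             // MoreLikeThisParams.MIN_DOC_FREQ
        params.put("mlt.match.include", "false"); // MoreLikeThisParams.MATCH_INCLUDE

        StringJoiner joined = new StringJoiner("&", "/mlt?", "");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            joined.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
        }
        return joined.toString();
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "id:example-topic" is a made-up unique-key query used for illustration only.
        System.out.println(buildMltPath("id:example-topic", "subject,body"));
    }
}
```

Appending the returned path to the base URL of the Solr core gives a request that can be tried in a browser or with curl before wiring it into SolrJ.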
Issue Links
- is superseded by
  STANBOL-1294 Topic Classification Framework for Stanbol (Open)
- relates to
  STANBOL-92 package solr index for popular entities (Closed)
  STANBOL-617 Define how TopicEnhancements are written to the Enhancement Structure (Closed)
  STANBOL-28 Extend the RDF enhancement vocabulary to handle topics explicitly (Closed)