Details
- Improvement
- Status: Closed
- Major
- Resolution: Done
- Jena 3.6.0
- None
Description
This issue proposes an improvement to jena-text to include optional highlighting of results via:
org.apache.lucene.search.highlight.Highlighter
and
org.apache.lucene.search.highlight.SimpleHTMLFormatter
The improvement will add an optional input argument to TextQueryPF that signals that the Lucene search results should be highlighted. The same argument can optionally specify the start and end character sequences that wrap a highlighted term, the maximum number of fragments to highlight, and a fragment separator.
The highlighted results are bound to the ?literal output argument of TextQueryPF.
The impact is minimal when highlighting is not used: the change adds only a simple extraction of the highlight option string and a single test for its presence. The highlight option string is passed directly to TextIndex.query(...) and so can be used from code other than TextQueryPF.
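To illustrate the kind of extraction and presence test described above, here is a minimal sketch in plain Java. It is not the code from the proposed PR: the class `HighlightOptionSketch` and the method `extractHighlightOptions` are hypothetical names, and the argument list is modelled as a simple list of strings rather than the actual TextQueryPF argument types.

```java
import java.util.List;
import java.util.Optional;

public class HighlightOptionSketch {
    // Prefix that marks the highlight option string, as used in the examples in this issue.
    static final String PREFIX = "highlight:";

    // Hypothetical helper: returns the text following "highlight:" if any argument
    // carries the prefix, or an empty Optional when highlighting was not requested.
    static Optional<String> extractHighlightOptions(List<String> args) {
        for (String arg : args) {
            if (arg.startsWith(PREFIX)) {
                return Optional.of(arg.substring(PREFIX.length()).trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> queryArgs = List.of("one", "lang:en", "highlight:");
        Optional<String> options = extractHighlightOptions(queryArgs);
        // The single presence test: non-highlighting queries take the unchanged path.
        if (options.isPresent()) {
            System.out.println("highlighting requested, options: '" + options.get() + "'");
        } else {
            System.out.println("no highlighting");
        }
    }
}
```

An empty option string (just `"highlight:"`) still counts as a request for highlighting, which matches the simplest usage where all defaults apply.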
The simplest use of highlighting is:
select ?s ?lit where { (?s ?sc ?lit) text:query (skos:prefLabel "one" 100 "lang:en" "highlight:") . }
which will produce results such as:
"another ↦one↤ abc"@en
The right-arrow (\u21a6) and left-arrow (\u21a4) are the default start and end highlighting character sequences, chosen because they are very unlikely to occur in literals. They can be changed via "s:" and "e:" in the highlight options, for example:
select ?s ?lit where { (?s ?sc ?lit) text:query (skos:prefLabel "one" 100 "lang:en" "highlight: s:<em class='hilite'> | e:</em>") . }
which will produce results such as:
"another <em class='hilite'>one</em> abc"@en
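The effect of the "s:" and "e:" options can be sketched in plain Java as follows. This is an illustration only: `HighlightMarkersSketch`, `parseMarkers`, and `wrap` are hypothetical names, the "|" separator is inferred from the example above, and the simple string replacement stands in for the Lucene Highlighter, which works on analyzed tokens rather than raw substrings. Only the two options named in this issue are handled.

```java
public class HighlightMarkersSketch {
    // Defaults described in this issue: right-arrow and left-arrow.
    static final String DEFAULT_START = "\u21a6";
    static final String DEFAULT_END = "\u21a4";

    // Hypothetical parser: splits the option string on '|' and reads "s:" and "e:".
    // Returns {start, end}; unspecified markers fall back to the defaults.
    static String[] parseMarkers(String options) {
        String start = DEFAULT_START;
        String end = DEFAULT_END;
        for (String part : options.split("\\|")) {
            String opt = part.trim();
            if (opt.startsWith("s:")) {
                start = opt.substring(2);
            } else if (opt.startsWith("e:")) {
                end = opt.substring(2);
            }
        }
        return new String[] { start, end };
    }

    // Crude stand-in for highlighting: wraps each literal occurrence of the term.
    static String wrap(String text, String term, String[] markers) {
        return text.replace(term, markers[0] + term + markers[1]);
    }

    public static void main(String[] args) {
        String[] custom = parseMarkers(" s:<em class='hilite'> | e:</em>");
        System.out.println(wrap("another one abc", "one", custom));

        String[] defaults = parseMarkers("");
        System.out.println(wrap("another one abc", "one", defaults));
    }
}
```

Splitting on "|" rather than on whitespace lets a marker such as `<em class='hilite'>` contain spaces.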
Coding of this improvement is complete and a PR can be issued if there is agreement that this improvement should be included in jena-text.