Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.9, 2.9.1, 3.0
- Fix Version/s: None
- Component/s: None
- Environment: Solr 1.4.0
- Lucene Fields: New
Description
I have a Tokenizer that uses an external resource. I wrote this Tokenizer so that the external resource is released in its close() method.
This should work because close() is supposed to be called when the caller is done with the TokenStream, of which Tokenizer is a subclass. TokenStream's API document <http://lucene.apache.org/java/2_9_1/api/core/org/apache/lucene/analysis/TokenStream.html> states:
6. The consumer calls close() to release any resource when finished using the TokenStream.
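To make the documented consumer workflow concrete, here is a minimal sketch. These are hypothetical stand-in classes (ToyTokenStream, consume), not Lucene's actual API; they only mirror the reset / incrementToken / end / close sequence the javadoc describes, ending with step 6 where close() releases resources.

```java
import java.util.ArrayList;
import java.util.List;

public class ConsumerContract {

    // Toy stand-in for a TokenStream: yields whitespace-separated tokens
    // and records whether close() has been called.
    static class ToyTokenStream {
        private final String[] tokens;
        private int pos = 0;
        private String current;
        boolean closed = false;

        ToyTokenStream(String text) { this.tokens = text.split("\\s+"); }

        void reset() { pos = 0; }

        // Advances to the next token; returns false when exhausted.
        boolean incrementToken() {
            if (pos >= tokens.length) return false;
            current = tokens[pos++];
            return true;
        }

        String term() { return current; }
        void end() { /* real Lucene sets final offset state here */ }
        void close() { closed = true; } // step 6: release resources
    }

    // The consumer-side workflow from the TokenStream javadoc.
    static List<String> consume(ToyTokenStream stream) {
        List<String> out = new ArrayList<>();
        stream.reset();
        while (stream.incrementToken()) {
            out.add(stream.term());
        }
        stream.end();
        stream.close();
        return out;
    }

    public static void main(String[] args) {
        ToyTokenStream s = new ToyTokenStream("quick brown fox");
        System.out.println(consume(s)); // [quick, brown, fox]
        System.out.println(s.closed);   // true
    }
}
```

Under this contract, a Tokenizer that frees an external resource in close() is perfectly legal: once close() returns, the consumer is done with the stream.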
When I used my Tokenizer from Solr 1.4.0, it did not work as expected. Error analysis suggests that an instance of my Tokenizer is used even after close() has been called and the external resource has been released. Further analysis indicates that it is not Solr but Lucene itself that breaks the contract.
This is happening in two places.
src/java/org/apache/lucene/queryParser/QueryParser.java:

    protected Query getFieldQuery(String field, String queryText) throws ParseException {
      // Use the analyzer to get all the tokens, and then build a TermQuery,
      // PhraseQuery, or nothing based on the term count
      TokenStream source;
      try {
        source = analyzer.reusableTokenStream(field, new StringReader(queryText));
        source.reset();
      ...
      try {
        // close original stream - all tokens buffered
        source.close();
      }
      catch (IOException e) {
        // ignore
      }
src/java/org/apache/lucene/index/DocInverterPerField.java:

    public void processFields(final Fieldable[] fields,
                              final int count) throws IOException {
      ...
      try {
        ...
      } finally {
        stream.close();
      }

Calling close() would be fine if the TokenStream were not a reusable one. But when it is reusable, it may be used again, so the resources associated with the TokenStream instance must not be released yet. close() should be called selectively, only when the caller knows the stream will not be reused.
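The failure mode can be sketched without any Lucene code. In this hypothetical example (ExternalResource, ResourceTokenizer, and reusableTokenizer are all made-up stand-ins, with the cache mirroring what reusableTokenStream() does), the analyzer-side cache hands back the same tokenizer instance on every call, but the consumer's unconditional close() has already released the resource the instance needs:

```java
import java.util.HashMap;
import java.util.Map;

public class ReuseVsClose {

    // Stands in for an external resource (dictionary, native handle, ...).
    static class ExternalResource {
        private boolean released = false;

        String lookup(String s) {
            if (released) throw new IllegalStateException("resource already released");
            return s.toUpperCase();
        }

        void release() { released = true; }
    }

    // Tokenizer that frees its resource in close(), as the javadoc permits.
    static class ResourceTokenizer {
        private final ExternalResource resource = new ExternalResource();

        String normalize(String token) { return resource.lookup(token); }

        void close() { resource.release(); } // fine for one-shot use only
    }

    // Analyzer-style cache that returns the SAME instance on each call,
    // mirroring the reuse behavior of reusableTokenStream().
    static final Map<String, ResourceTokenizer> cache = new HashMap<>();

    static ResourceTokenizer reusableTokenizer(String field) {
        return cache.computeIfAbsent(field, f -> new ResourceTokenizer());
    }

    public static void main(String[] args) {
        ResourceTokenizer t1 = reusableTokenizer("body");
        System.out.println(t1.normalize("hello")); // HELLO
        t1.close(); // consumer follows the documented contract and closes

        // Second use: the cache hands back the same, now-dead instance.
        ResourceTokenizer t2 = reusableTokenizer("body");
        try {
            t2.normalize("world");
        } catch (IllegalStateException e) {
            System.out.println("broken: " + e.getMessage());
        }
    }
}
```

This is exactly the conflict described above: either close() releases resources (and the instance must never be reused) or the instance is cached for reuse (and close() must not be called on it unconditionally); the two code paths quoted above do both at once.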
Attachments

Issue Links
- is related to SOLR-4872: Allow schema analysis object factories to be cleaned up properly when the core shuts down (Open)