Description
The analyze Stream Evaluator uses a Solr analyzer to return a collection of tokens from a text field. The collection of tokens can then be streamed out by the cartesianProduct Streaming Expression or attached to documents as multi-valued fields by the select Streaming Expression.
This lets Streaming Expressions reuse all of Solr's existing tokenizers and filters, and it provides a place to add future NLP analyzers.
Sample syntax:
cartesianProduct(expr, analyze(analyzerField, textField) as outfield)
select(expr, analyze(analyzerField, textField) as outfield)
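To make the difference between the two forms concrete, here is a minimal Python sketch that only simulates the behavior described above. It does not call Solr. The names simple_analyzer, select_analyze and cartesian_product_analyze are hypothetical, and the lowercase/whitespace tokenizer is an assumed stand-in for whatever analyzer is configured on analyzerField.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Doc = Dict[str, object]


def simple_analyzer(text: str) -> List[str]:
    """Hypothetical stand-in for a Solr analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def select_analyze(
    docs: Iterable[Doc],
    text_field: str,
    out_field: str,
    analyzer: Callable[[str], List[str]] = simple_analyzer,
) -> Iterator[Doc]:
    """Mimic select(...): attach the token list to each document as a multi-valued field."""
    for doc in docs:
        out = dict(doc)  # copy so the input document is not mutated
        out[out_field] = analyzer(str(doc.get(text_field, "")))
        yield out


def cartesian_product_analyze(
    docs: Iterable[Doc],
    text_field: str,
    out_field: str,
    analyzer: Callable[[str], List[str]] = simple_analyzer,
) -> Iterator[Doc]:
    """Mimic cartesianProduct(...): emit one copy of the document per token."""
    for doc in docs:
        for token in analyzer(str(doc.get(text_field, ""))):
            out = dict(doc)
            out[out_field] = token
            yield out


docs = [{"id": "1", "body": "Solr streams text"}]
selected = list(select_analyze(docs, "body", "outfield"))
exploded = list(cartesian_product_analyze(docs, "body", "outfield"))
```

In this sketch, selected holds one document with a list-valued outfield, while exploded holds one document per token, which is the shape a downstream per-token aggregation would likely want.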
Combined with Solr's batch text processing capabilities, this provides an entire parallel NLP framework. Solr's batch processing capabilities are described here:
Batch jobs, Parallel ETL and Streaming Text Transformation
http://joelsolr.blogspot.com/2016/10/solr-63-batch-jobs-parallel-etl-and.html