XML documents contain textual content, and searching large corpora of such documents is a common task. The intent of this project is to leverage Apache Lucene's indexing and search capabilities so that users of the VXQuery engine can express and run text-search queries.
This project will have two parts:
1. Design and implement the ability for users to create and manage text indices on collections of XML documents.
2. Implement functions in VXQuery that exploit these text indices to execute relevant queries efficiently.
As a starting point, the system does not need to perform automatic index selection, that is, deciding to use a text index when the query does not explicitly refer to one. Instead, queries would call the index functions directly, and the system would then be required to use the indices they refer to.
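To make the two parts concrete, here is a minimal Python sketch of the idea. It is only an illustration: it does not use Lucene or VXQuery, and it stands in a plain in-memory inverted index for a real Lucene index. The names `build_text_index` and `search_text_index`, and the sample documents, are hypothetical and chosen for this example. The query calls the search function explicitly, mirroring the "no automatic index selection" starting point.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_text_index(collection):
    """Part 1 (hypothetical): map each term to the documents containing it.

    `collection` is a dict of {document name: XML source string}.
    """
    index = defaultdict(set)
    for name, xml_source in collection.items():
        root = ET.fromstring(xml_source)
        # itertext() yields the text content of every element in the document.
        for chunk in root.itertext():
            for term in tokenize(chunk):
                index[term].add(name)
    return index


def search_text_index(index, query):
    """Part 2 (hypothetical): names of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # Look up each term's posting set without creating new index entries.
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Sample collection, invented for this sketch.
collection = {
    "a.xml": "<book><title>Indexing XML</title>"
             "<body>Lucene builds inverted indices.</body></book>",
    "b.xml": "<book><title>Query engines</title>"
             "<body>XQuery runs over XML collections.</body></book>",
}

index = build_text_index(collection)
print(search_text_index(index, "xml"))
print(search_text_index(index, "inverted lucene"))
```

In the actual project the index would be built and searched by Lucene, and the two functions would be exposed to users as VXQuery functions rather than Python ones.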