Description
I would like to add a new, simple QueryParser to Lucene that is designed to parse human-entered queries. The parser will apply the entire entered query to either a single specified field or a set of weighted fields (the weights are applied as term boosts).
Every feature/operation in this parser can be enabled or disabled individually, depending on what the user needs. The default operator may be set to either 'MUST' (representing 'and') or 'SHOULD' (representing 'or'). The parser will include the following features/operations:
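To illustrate the weighted-fields idea, here is a minimal sketch of how one term might be expanded into one boosted clause per field. The class name `WeightedFieldsSketch`, the method `expand`, and the field names are hypothetical and are not part of the proposed parser; the `field:term^boost` notation is used only to display the result.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical illustration only; not the proposed parser's API. */
final class WeightedFieldsSketch {

    /** Expands one term into a boosted clause for every configured field. */
    static String expand(String term, Map<String, Float> weights) {
        StringJoiner clauses = new StringJoiner(" ");
        for (Map.Entry<String, Float> e : weights.entrySet()) {
            // Each field gets the same term, boosted by that field's weight.
            clauses.add(e.getKey() + ":" + term + "^" + e.getValue());
        }
        return clauses.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order (assumed field names).
        Map<String, Float> weights = new LinkedHashMap<>();
        weights.put("title", 5.0f);
        weights.put("body", 1.0f);
        System.out.println(expand("lucene", weights)); // title:lucene^5.0 body:lucene^1.0
    }
}
```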
- AND specified as '+'
- OR specified as '|'
- NOT specified as '-'
- PHRASE surrounded by double quotes
- PREFIX specified as '*'
- PRECEDENCE surrounded by '(' and ')'
- WHITESPACE specified as ' ', '\n', '\r', or '\t'; whitespace between terms causes the default operator to be used
- ESCAPE specified as '\'; escaping an operator character allows it to be used literally in a term
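The operator list above, together with the ability to switch each feature on or off, could be sketched with one bit flag per feature. This is only an illustration under stated assumptions: the class name, flag names, and `tokenize` method are hypothetical, and the sketch simplifies by letting any enabled operator character split a term regardless of its position.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; names are illustrative, not the proposed parser's API. */
final class SimpleSyntaxSketch {
    // One bit per feature, so any combination can be enabled or disabled.
    static final int AND = 1, OR = 1 << 1, NOT = 1 << 2, PHRASE = 1 << 3,
            PREFIX = 1 << 4, PRECEDENCE = 1 << 5, WHITESPACE = 1 << 6, ESCAPE = 1 << 7;
    static final int ALL = (1 << 8) - 1;

    /** Splits a query into term tokens and single-character operator tokens. */
    static List<String> tokenize(String query, int flags) {
        List<String> tokens = new ArrayList<>();
        StringBuilder term = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\' && (flags & ESCAPE) != 0) {
                // The escaped character becomes part of the term; a trailing '\' is dropped.
                if (i + 1 < query.length()) term.append(query.charAt(++i));
            } else if (isOperator(c, flags)) {
                flush(term, tokens);
                tokens.add(String.valueOf(c));
            } else if (isWhitespace(c) && (flags & WHITESPACE) != 0) {
                flush(term, tokens); // whitespace only separates terms
            } else {
                term.append(c); // disabled operators are ordinary term characters
            }
        }
        flush(term, tokens);
        return tokens;
    }

    static boolean isOperator(char c, int flags) {
        switch (c) {
            case '+': return (flags & AND) != 0;
            case '|': return (flags & OR) != 0;
            case '-': return (flags & NOT) != 0;
            case '"': return (flags & PHRASE) != 0;
            case '*': return (flags & PREFIX) != 0;
            case '(': case ')': return (flags & PRECEDENCE) != 0;
            default: return false;
        }
    }

    static boolean isWhitespace(char c) {
        return c == ' ' || c == '\n' || c == '\r' || c == '\t';
    }

    private static void flush(StringBuilder term, List<String> tokens) {
        if (term.length() > 0) {
            tokens.add(term.toString());
            term.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo\\+bar baz", ALL));   // [foo+bar, baz]
        System.out.println(tokenize("foo+bar", ALL & ~AND));  // [foo+bar]
    }
}
```

In this sketch, disabling a feature simply turns its character back into ordinary term text, which is one straightforward way to honor the enable/disable requirement.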
The key differences between this parser and other existing parsers will be the following:
- No exceptions will be thrown, and errors in syntax will be ignored. The parser will do a best-effort interpretation of any query entered.
- It uses minimal syntax to express queries. Every operator is either a single character or a pair of single characters (such as the quotes around a phrase or the parentheses around a group).
- The parser is hand-written and contained in a single Java file, making it easy to modify.
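As an illustration of the best-effort behavior described in the first point, the sketch below shows one possible lenient policy for unbalanced parentheses: stray ')' characters are ignored and unclosed '(' groups are closed at the end of the input, so no exception is ever needed. The class and method names are hypothetical, and the proposed parser may resolve such cases differently.

```java
/** Hypothetical sketch of one lenient policy; not the proposed parser's actual behavior. */
final class LenientParensSketch {

    /** Returns the query with unmatched ')' dropped and unclosed '(' closed at the end. */
    static String balance(String query) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // number of currently open groups
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '(') {
                depth++;
                out.append(c);
            } else if (c == ')') {
                if (depth > 0) {
                    depth--;
                    out.append(c);
                }
                // Otherwise the ')' has no matching '(' and is silently ignored.
            } else {
                out.append(c);
            }
        }
        // Close any groups the user left open instead of reporting an error.
        while (depth-- > 0) {
            out.append(')');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(balance("(quick | brown")); // (quick | brown)
        System.out.println(balance("fox) jumps"));     // fox jumps
    }
}
```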