Details
-
Improvement
-
Status: Open
-
Minor
-
Resolution: Unresolved
-
4.9
Description
Indexed documents frequently contain different types in different field, e.g emails, telephone numbers, ips etc. The fields may have been extracted from the content field or originally structured that way.
We should propose a queryParser that recognizes the query token type (eg. regex) and implicitly reformulate the query to run against the matching field only. That would make a good performance boost in case the query is running on a "catch them all" field and a more adapted analyze for the different types.
It would also avoid the idf drift that occurs on an above "catch them all" field.
A workaround could be using the type token filter with the matching type whitelist and querying all the different field types with edismax's qf param.