Description
useDocValuesAsStored
Often a field is both stored="true" and docValues="true", which means the same data is kept on disk twice. Since reading from docValues is both efficient and common practice (facets, analytics, streaming, etc.), returning values from docValues when no stored version of the field exists would be a valuable disk-usage optimization.
The only caveat I can see is with multiValued fields, which would always be returned in sorted order when read from docValues. I believe this is a fair compromise.
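To make the multiValued caveat concrete, here is a minimal sketch in plain Java. It does not use any Solr or Lucene API: ordinary lists stand in for the two read paths, and the class and method names (`MultiValuedOrderSketch`, `readAsStored`, `readAsDocValues`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: plain collections stand in for the stored-field and
// docValues read paths. None of these names are Solr APIs.
class MultiValuedOrderSketch {

    // Stored fields hand the values back in the order they were indexed.
    static List<String> readAsStored(List<String> indexedValues) {
        return new ArrayList<>(indexedValues);
    }

    // A multiValued field read from docValues comes back sorted,
    // so the original insertion order is lost.
    static List<String> readAsDocValues(List<String> indexedValues) {
        List<String> sorted = new ArrayList<>(indexedValues);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> indexed = Arrays.asList("pear", "apple", "mango");
        System.out.println(readAsStored(indexed));    // [pear, apple, mango]
        System.out.println(readAsDocValues(indexed)); // [apple, mango, pear]
    }
}
```

A client that relies on the position of values within a multiValued field would see a behavior change if that field moved from stored to docValues-only.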
I've done a rough implementation for this as a field transform, but I think it should live closer to where stored fields are loaded in the SolrIndexSearcher.
Two open questions/observations:
1) There doesn't seem to be a standard way to read values from docValues: facets, analytics, streaming, etc. each seem to do it their own way. Perhaps some of this logic should be centralized.
2) What will the API behavior be? (Below is my proposed implementation)
Parameters for fl:
- 2a) fl="docValueField"
  - if the field is stored, return it from stored fields; if it is not stored but has docValues, return it from docValues
- 2b) fl="*"
  - return only stored fields
- 2c) fl="+"
  - return stored fields and docValues fields
2a would be the easiest to implement and might be sufficient for a first pass. 2b is the current behavior.
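The proposed fl rules can be sketched as a small decision function. This is only an illustration of the proposal as described above, not Solr code: `FlResolutionSketch`, `FieldDef`, `Source`, and `resolve` are hypothetical names, and the three-field schema in `main` is invented for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed fl behavior. None of these names are Solr APIs.
class FlResolutionSketch {

    // Where a returned value would be read from.
    enum Source { STORED, DOC_VALUES }

    // Minimal stand-in for a schema field definition.
    static final class FieldDef {
        final String name;
        final boolean stored;
        final boolean docValues;

        FieldDef(String name, boolean stored, boolean docValues) {
            this.name = name;
            this.stored = stored;
            this.docValues = docValues;
        }
    }

    // Maps each field that fl selects to the source its value is read from.
    static Map<String, Source> resolve(String fl, List<FieldDef> schema) {
        boolean star = fl.equals("*");
        boolean plus = fl.equals("+");
        Map<String, Source> result = new LinkedHashMap<>();
        for (FieldDef field : schema) {
            boolean requested = star || plus || fl.equals(field.name);
            if (!requested) {
                continue;
            }
            if (field.stored) {
                // A stored copy is always preferred when it exists.
                result.put(field.name, Source.STORED);
            } else if (field.docValues && !star) {
                // Explicit field name or "+": fall back to docValues.
                result.put(field.name, Source.DOC_VALUES);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FieldDef> schema = Arrays.asList(
                new FieldDef("id", true, false),
                new FieldDef("price", false, true),
                new FieldDef("title", true, true));
        System.out.println(resolve("price", schema)); // {price=DOC_VALUES}
        System.out.println(resolve("*", schema));     // {id=STORED, title=STORED}
        System.out.println(resolve("+", schema));     // {id=STORED, price=DOC_VALUES, title=STORED}
    }
}
```

In this sketch a stored copy always wins, so a field that is both stored and docValues keeps its original value order; only docValues-only fields are affected by the fallback.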
Attachments
Issue Links
- depends upon
  - SOLR-8339 SolrDocument and SolrInputDocument should have a common interface (Closed)
- is depended upon by
  - SOLR-8276 Atomic updates & RTG don't work with non-stored docvalues (Closed)
- is related to
  - SOLR-8344 Decide default when requested fields are both column and row stored. (Closed)
  - SOLR-5478 Optimization: Fetch all "fl" values from docValues instead of stored values if possible/equivalent (Closed)
  - SOLR-8316 Allow a field to be stored=false indexed=false docValues=true (Closed)
- relates to
  - SOLR-9612 Stored field access should be avoided when it's not needed (Open)