Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Our data-driven schema guessing fails in many situations. For example, if the first document has a field with the value "0", the field is guessed as Long, and subsequent documents with "0.0" in that field are rejected. Similarly, if the same field has alphanumeric content in a later document, that document is rejected. Single- vs. multi-valued field guessing is also not ideal.
I propose an offline training mode in which Solr accepts a batch of documents and returns a guessed schema without indexing them. This schema can then be used for the actual indexing. I think the original idea is from Hoss.
I think the initial implementation can be based on an UpdateRequestProcessor. We can hash out the API as we go along.
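To make the proposal concrete, here is a minimal sketch of guessing over a whole batch rather than from the first document. It is not Solr code: the class `SchemaGuesser`, the `GuessedType` enum, and `FieldGuess` are hypothetical names invented for illustration, and the three-step widening order (Long, then Double, then String) is an assumption rather than an agreed design. A real implementation would presumably live in an UpdateRequestProcessor and map these guesses to actual Solr field types.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not Solr code: guesses field types from a batch of documents. */
public class SchemaGuesser {

    /** Candidate types, ordered from narrowest to widest (assumed ordering). */
    public enum GuessedType { LONG, DOUBLE, STRING }

    /** Running guess for a single field. */
    public static final class FieldGuess {
        public GuessedType type = GuessedType.LONG;
        public boolean multiValued = false;

        @Override
        public String toString() {
            return type + (multiValued ? " (multiValued)" : "");
        }
    }

    /** Classifies one raw value by trying the narrowest type first. */
    static GuessedType classify(Object value) {
        String s = String.valueOf(value).trim();
        try {
            Long.parseLong(s);
            return GuessedType.LONG;
        } catch (NumberFormatException ignored) {
            // Not a long; fall through and try double.
        }
        try {
            Double.parseDouble(s);
            return GuessedType.DOUBLE;
        } catch (NumberFormatException ignored) {
            return GuessedType.STRING;
        }
    }

    /** Scans every document and only ever widens a field's guess, never narrows it. */
    public static Map<String, FieldGuess> guess(List<Map<String, Object>> docs) {
        Map<String, FieldGuess> schema = new LinkedHashMap<>();
        for (Map<String, Object> doc : docs) {
            for (Map.Entry<String, Object> e : doc.entrySet()) {
                FieldGuess guess = schema.computeIfAbsent(e.getKey(), k -> new FieldGuess());
                Object raw = e.getValue();
                Collection<?> values = (raw instanceof Collection)
                        ? (Collection<?>) raw
                        : Collections.singletonList(raw);
                // Any document carrying more than one value marks the field multi-valued.
                if (values.size() > 1) {
                    guess.multiValued = true;
                }
                for (Object v : values) {
                    GuessedType t = classify(v);
                    if (t.compareTo(guess.type) > 0) {
                        guess.type = t;
                    }
                }
            }
        }
        return schema;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = List.of(
                Map.of("price", "0"),
                Map.of("price", "0.0"),
                Map.of("tags", List.of("a", "b")));
        System.out.println(guess(docs));
    }
}
```

Because the guess is only finalized after the whole batch has been seen, a field that starts with "0" and later contains "0.0" ends up as a floating-point type instead of causing a rejection, and a later alphanumeric value would widen it further to a string. Note that `Double.parseDouble` also accepts inputs such as "NaN" or "1e5", so a production version would likely need stricter parsing.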
Attachments
Issue Links
- is related to
  - SOLR-6939 UpdateProcessor to buffer & sample documents and then batch create neccessary fields (Open)
  - SOLR-14701 Deprecate Schemaless Mode (Discussion) (Open)
- relates to
  - SOLR-15277 Schema Designer in Admin UI (Closed)