[SOLR-10229] See what it would take to shift many of our one-off schemas used for testing to managed schema and construct them as part of the tests - ASF JIRA

Details

Type: Wish
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

The test schema files are intimidating. There are about a zillion of them, and changing any one of them risks breaking some other test. That leaves people three choices:

1> Add what they need to some existing schema, which makes the schemas bigger and bigger.

2> Create a new schema file, adding to the proliferation thereof.

3> Look through all the existing tests to see if one already has something that works.

The recent work on LUCENE-7705 is a case in point. We're adding a maxLen parameter to some tokenizers. Putting that parameter into any of the existing schemas, especially to test tokens shorter than 255 characters, is virtually guaranteed to break other tests, so the only safe thing to do is to create another schema file, which adds to the multiplication of files.

As part of SOLR-5260 I tried creating the schema on the fly rather than creating a new static schema file, and it's not hard. What do you think about making this into a better thought-out utility?
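To make the on-the-fly idea a bit more concrete, here is a minimal sketch of what a test-side helper might look like. It is not the code from SOLR-5260 or from any of the attached patches: `OnTheFlySchema` and its methods are hypothetical names used only for illustration. It just renders a Schema API `add-field` request body, and it assumes field names and type names need no JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of Solr): collects field definitions for one test. */
class OnTheFlySchema {
    // Field definitions in insertion order: field name -> attributes.
    private final Map<String, Map<String, Object>> fields = new LinkedHashMap<>();

    /** Registers a field; redefining a name replaces the earlier definition. */
    OnTheFlySchema field(String name, String type, boolean indexed, boolean stored) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        attrs.put("name", name);
        attrs.put("type", type);
        attrs.put("indexed", indexed);
        attrs.put("stored", stored);
        fields.put(name, attrs);
        return this;
    }

    /** Renders all fields as a single "add-field" command with an array of definitions. */
    String toJson() {
        StringJoiner array = new StringJoiner(",", "[", "]");
        for (Map<String, Object> attrs : fields.values()) {
            StringJoiner body = new StringJoiner(",", "{", "}");
            for (Map.Entry<String, Object> e : attrs.entrySet()) {
                Object v = e.getValue();
                // Assumption: string values contain nothing that needs JSON escaping.
                String rendered = (v instanceof String) ? "\"" + v + "\"" : String.valueOf(v);
                body.add("\"" + e.getKey() + "\":" + rendered);
            }
            array.add(body.toString());
        }
        return "{\"add-field\":" + array + "}";
    }
}
```

A test would then declare only the fields it cares about, for example `new OnTheFlySchema().field("id", "string", true, true).toJson()`, and send the result to the collection's schema endpoint before indexing.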

At present this is pretty fuzzy; I wanted to get some reactions before putting much effort into it. I expect that the utility methods would eventually accumulate a bunch of canned types. That's reasonably straightforward for primitive types, if lengthy, but it gets less straightforward once you get into solr.TextField-based types.
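The difference between the two kinds of canned types can be sketched roughly as follows. This is only an illustration under stated assumptions: `CannedFieldTypes`, `primitive` and `text` are hypothetical names, not anything in the patches. The nested maps mirror the shape of a Schema API `add-field-type` definition, where a primitive type is a name plus a class while a solr.TextField type also needs an analyzer with a tokenizer and an ordered filter chain.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical registry of canned field types for tests (not part of Solr). */
class CannedFieldTypes {
    /** Primitive types need only a name and an implementing class. */
    static Map<String, Object> primitive(String name, String solrClass) {
        Map<String, Object> type = new LinkedHashMap<>();
        type.put("name", name);
        type.put("class", solrClass);
        return type;
    }

    /** Text types additionally carry an analyzer: one tokenizer plus filters in order. */
    static Map<String, Object> text(String name, String tokenizerClass, String... filterClasses) {
        Map<String, Object> tokenizer = new LinkedHashMap<>();
        tokenizer.put("class", tokenizerClass);

        List<Map<String, Object>> filters = new ArrayList<>();
        for (String filterClass : filterClasses) {
            Map<String, Object> filter = new LinkedHashMap<>();
            filter.put("class", filterClass);
            filters.add(filter);
        }

        Map<String, Object> analyzer = new LinkedHashMap<>();
        analyzer.put("tokenizer", tokenizer);
        analyzer.put("filters", filters);

        Map<String, Object> type = primitive(name, "solr.TextField");
        type.put("analyzer", analyzer);
        return type;
    }
}
```

For example, `CannedFieldTypes.primitive("string", "solr.StrField")` is a one-liner, whereas `CannedFieldTypes.text("text_ws_lc", "solr.WhitespaceTokenizerFactory", "solr.LowerCaseFilterFactory")` already has three moving parts, and every tokenizer or filter parameter a test wants to vary multiplies the number of canned variants.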

We could end up just moving the "intimidation" from the plethora of schema files to a zillion fieldTypes in the utility to choose from...

Also, forcing every test to define its fields up front is arguably less convenient than having some canned schemas we can use. And deliberately erroneous schemas for testing failure modes are probably not a good fit for any such framework.

steve_rowe and hossman_lucene@fucit.org in particular might have something to say.

Attachments

| File | Uploaded | Size | Author |
| --- | --- | --- | --- |
| SOLR-10229.patch | 24/Jul/17 18:05 | 87 kB | Erick Erickson |
| SOLR-10229.patch | 23/Jul/17 17:44 | 93 kB | Erick Erickson |
| SOLR-10229.patch | 13/Apr/17 18:01 | 38 kB | Amrit Sarkar |
| SOLR-10229.patch | 08/Apr/17 10:01 | 73 kB | Amrit Sarkar |
| SOLR-10229.patch | 08/Apr/17 09:00 | 72 kB | Amrit Sarkar |
| SOLR-10229.patch | 08/Apr/17 08:55 | 72 kB | Amrit Sarkar |
| SOLR-10229.patch | 06/Apr/17 05:25 | 43 kB | Amrit Sarkar |
| SOLR-10229.patch | 31/Mar/17 19:10 | 6 kB | Amrit Sarkar |
| SOLR-10229-straw-man.patch | 04/Jul/17 05:19 | 80 kB | Erick Erickson |

Issue Links

is blocked by

- SOLR-11034 Redundent/Unneccessary SolrCore reload when ManagedIndexSchema changes are made in cloud mode (Open)
- SOLR-11035 (at least) 2 distinct failures possible when clients attempt searches during SolrCore reload (Closed)

is depended upon by

- LUCENE-7705 Allow CharTokenizer-derived tokenizers and KeywordTokenizer to configure the max token length (Resolved)

is related to

- SOLR-12801 Fix the tests, remove BadApples and AwaitsFix annotations, improve env for test development. (Open)

People

Assignee: Unassigned

Reporter: Erick Erickson

Votes: 1

Watchers: 6

Dates

Created: 05/Mar/17 21:48

Updated: 27/Jun/20 11:37