Description
We need to seriously beef up our testing of PointFields to figure out what Solr features don't currently work with PointFields.
The existing Trie/Point randomization logic in SolrTestCaseJ4 is a good start – but only a handful of schema files leverage it.
Although a jira/SOLR-10807 branch was originally created with this goal, it was ultimately just used for initial experimentation, and has been abandoned. The "meat" of the work needed to improve how we randomize in Point fields was done in SOLR-10864, and other sub-tasks of this issue have been / are being used to track rolling out this randomization to more and more test schema files and validating the affected tests.
This effort is now highly parallelizable – so here are some rough guidelines/suggestions for folks interested in contributing to this effort:
- create a subtask identifying the name (or glob) of the test-files schema file you plan to tackle before starting (so multiple people don't duplicate work on the same tests)
- run the following one liner (assuming bash/perl) to change all Trie field types in the schema(s) to use the new randomized system vars...
find -name \*your-schema-glob-or-name\* -type f | xargs perl -i -ple 's/class="solr.TrieIntField"/class="solr.TrieIntegerField"/g; s/class="solr.Trie(.*)Field"/class="\${solr.tests.$1FieldType}"/g; unless (/docValues/) { s/(class="\${solr.tests..*FieldType}")/$1 docValues="\${solr.tests.numeric.dv}"/g; }'
- identify the affected tests
  - grep for the schema file names in all test classes to start building the list
  - recursively check each test class in the list for subclasses
- hammer on all affected tests with many diff seeds
  - NOTE: you can force the points vs trie choice by specifying -Dsolr.tests.use.numeric.points=true vs -Dsolr.tests.use.numeric.points=false
  - folks with beefy machines may find it handy to use 2 git working dirs to hammer on diff seeds, each with a diff hardcoded value of that sysprop
- If you encounter any test failures...
  - figure out the root cause
  - file a new "Bug" jira, link it as related to SOLR-10807 & SOLR-8396
    - NOTE: first double check there isn't already a bug on point linked to from one of those places
  - use @SuppressPointFields (citing the new jira) if necessary for any functionality that absolutely will not work with point fields
  - use something like this in tests where functionality requires docValues in order to work properly with points (although in practice, the comment should always cite the relevant jira)
  - use something like this if a small subset of a test is known to not work with points...
  - use something like this if the test has a need/reason to care/assert what the underlying FieldType is of a numeric field...
- when committing changes, the commit msg should cite both the original sub-task jira#, as well as any bug jira#s that needed special annotation/handling/assumptions - so that in the future people working on fixing those bugs have easy to find GIT SHAs identifying when/where tests are currently hacked to avoid the bugs.
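The "identify the affected tests" and "hammer on all affected tests" steps above can be sketched roughly as follows. This is only an illustration: the helper names `find_affected_tests` and `print_hammer_commands` are hypothetical, the `ant test -Dtestcase=... -Dtests.seed=...` invocation is an assumption about how tests are launched in your checkout, and `grep --include` assumes a GNU or BSD grep. Only the `solr.tests.use.numeric.points` sysprop comes from the guidelines themselves.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of the Solr build.

# Print every Java source under directory $2 that mentions schema file $1.
# This only finds direct references; subclasses of the listed test classes
# still have to be tracked down separately (e.g. by grepping for each class name).
find_affected_tests() {
  local schema="$1" src_root="$2"
  grep -rlF --include='*.java' -- "$schema" "$src_root" | sort
}

# Print (do not run) one test command per seed for each value of the
# points-vs-trie sysprop. The "ant test" invocation is an assumption;
# swap in whatever command runs a single test class in your checkout.
print_hammer_commands() {
  local testcase="$1"; shift
  local seed points
  for seed in "$@"; do
    for points in true false; do
      echo "ant test -Dtestcase=${testcase} -Dtests.seed=${seed} -Dsolr.tests.use.numeric.points=${points}"
    done
  done
}

# Example usage (schema name, path, test class and seeds are placeholders):
#   find_affected_tests schema-foo.xml solr/core/src/test
#   print_hammer_commands TestFoo DEADBEEF CAFEBABE
```

Printing the commands rather than running them makes it easy to split the two sysprop values across separate working dirs, as suggested above.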
Attachments
Issue Links
- blocks
  - SOLR-10760 Remove trie field types and fields from example schemas (Closed)
- is related to
  - SOLR-10844 group.facet failures when the grouping field is Points based (or Trie w/docValues??) (Open)
  - SOLR-10845 GraphTermsQParserPlugin doesn't work with Point fields (or DocValues only fields?) (Resolved)
  - SOLR-10829 IndexSchema should enforce that uniqueKey field must not be points based (Resolved)
  - SOLR-10832 Using "indexed" PointField for _version_ breaks VersionInfo.getMaxVersionFromIndex (Resolved)
  - SOLR-10833 Numeric FooPointField classes inconsistent with TrieFooFields on malformed input: throw NumberFormatException (Resolved)
  - SOLR-10835 ExportWriter only works with TrieFooFields, not FooPointFields (Resolved)
  - SOLR-10803 Mark all Trie/LegacyNumeric based fields @deprecated in Solr7 (Closed)
  - SOLR-10846 ExternalFileField/FileFloatSource throws NPE if keyField is Points based (Closed)
  - SOLR-10847 TermsComponent doesn't work with Points fields - confusing errors when using terms.list (Closed)
  - SOLR-10919 ord & rord functions give confusing errors with PointFields (Closed)
  - SOLR-10918 Hashing for IntPointFields is broken for HLL (Closed)
  - SOLR-10926 increase the odds of randomly choosing point fields in our SolrTestCaseJ4 numeric type randomization (Resolved)
  - SOLR-10834 test configs should be changed to stop using numeric based uniqueKey field (Resolved)
- relates to
  - SOLR-10177 Consolidate randomized usage of PointFields in schemas (Resolved)