Details
-
Bug
-
Status: Closed
-
Minor
-
Resolution: Fixed
-
7.1
-
None
-
New
Description
I’m facing a problem with taxonomy writer cache inconsistency. At some point in time UTF8TaxonomyWriterCache starts to return wrong ord for some facet labels. As result wrong ord are written in doc facet fields, and wrong counts are returned (undercount) during search. This bug is manifested on different servers with different index contents (we have several separate indexes with unique data).
Unfortunately I can’t reproduce this behaviour in tests.
I've dumped "broken" UTF8TaxonomyWriterCache instance and created app to load it and to compare with real taxonomy. Dumps and app are in attachment. To run demo extract archives content and exec:
mvn compile
mvn exec:java -Dexec.mainClass="me.torobaev.lucene.taxonomy.cache.TaxonomyCacheCheck" -DtaxonomyDir=../taxonomy/ -DcacheDump=../taxonomy-cache.json
As you can see, labels [frametype, 7] and [modification_id, 682] have same ord in cache.