Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
I thought we had cleaned out all of the python2 usage, but there is one straggler left.
The script that generates HTMLCharacterEntities.jflex is currently set to run with python2, but it should use python3. With python3 it generates exactly the sources that are in master today. With python2 (as currently configured) it generates a slightly different grammar:
--- a/lucene/analysis/common/src/java/org/apache/lucene/analysis/charfilter/HTMLCharacterEntities.jflex
+++ b/lucene/analysis/common/src/java/org/apache/lucene/analysis/charfilter/HTMLCharacterEntities.jflex
@@ -60,7 +60,7 @@ CharacterEntities = ( "AElig" | "Aacute" | "Acirc" | "Agrave" | "Alpha"
 | "times" | "trade" | "uArr" | "uacute" | "uarr" | "ucirc" | "ugrave"
 | "uml" | "upsih" | "upsilon" | "uuml" | "weierp" | "xi" | "yacute"
 | "yen" | "yuml" | "zeta"
- | "zwj" | "zwnj" )
+(' | "zwj" | "zwnj"', ')')
This then cascades: HTMLStripCharFilter.java is also regenerated differently, with a different DFA.
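The changed line in the diff looks like the repr of a two-element tuple. That would be consistent with a call such as print(line, ')') being run under Python 2, where print is a statement and the parenthesized arguments are printed as a tuple. This is an assumption based only on the output above, not on the generator script itself. A minimal sketch in Python 3 that reproduces both outputs (the variable and function names here are hypothetical):

```python
import io
from contextlib import redirect_stdout

# Hypothetical stand-in for the last line of the generated alternation.
last_line = ' | "zwj" | "zwnj"'


def python3_output(line):
    """Capture what print(line, ')') writes under Python 3."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        print(line, ')')
    return buf.getvalue().rstrip('\n')


def python2_output(line):
    """Emulate Python 2's print statement given (line, ')').

    Assumption: without `from __future__ import print_function`,
    Python 2 parses the parentheses as a tuple and prints its repr.
    """
    return repr((line, ')'))


py3 = python3_output(last_line)
py2 = python2_output(last_line)
print(py3)
print(py2)
```

If this is indeed the cause, running the generator with python3 (or adding `from __future__ import print_function` to a Python 2 script) would produce the first form, which matches the line removed in the diff.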