HunspellStemFilterTest is the only lucene/analysis test I see using setEnableChecks.
It sets it to true, which is dead code (true is the default!).
Although there do seem to be some highlighter tests that use it.
Highlighter has a built-in limiter that limits based not on token count but on the accumulated number of analyzed chars.
So its tests disable the checks for the same reason LimitTokenCount's tests do (or should).
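To make the char-based limiting concrete, here is a minimal self-contained sketch (a toy model, not the highlighter's actual API — the class and method names are made up) of limiting by accumulated analyzed chars rather than by token count:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of char-based limiting: stop once the accumulated length of the
// tokens analyzed so far would exceed a char budget, regardless of how many
// tokens that is.
class CharLimitDemo {
    // Returns the prefix of tokens whose accumulated char count stays within maxChars.
    static List<String> limitByChars(List<String> tokens, int maxChars) {
        List<String> out = new ArrayList<>();
        int analyzed = 0;
        for (String tok : tokens) {
            analyzed += tok.length();
            if (analyzed > maxChars) break; // char budget exhausted: truncate here
            out.add(tok);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("the", "quick", "brown", "fox");
        // "the"(3) + "quick"(5) = 8 <= 10; adding "brown" would reach 13 > 10
        System.out.println(limitByChars(tokens, 10)); // [the, quick]
    }
}
```

Note that a short token can still be dropped if long tokens before it used up the budget — which is why this limiter, like a token-count limiter, can abandon the stream mid-way.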
2) I don't see any existing tests for LimitTokenCountFilter... were they deleted by mistake?
I think these are in TestLimitTokenCountAnalyzer? For Lucene users this is the way you use it (you just wrap your analyzer).
3) The closest thing I see to a test of LimitTokenCountFilter is TestLimitTokenCountAnalyzer. I realize now that the reason testLimitTokenCountAnalyzer doesn't get the same failure is that it wraps WhitespaceAnalyzer and StandardAnalyzer. Should those be changed to use MockTokenizer?
I think we should always do this!
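The value of swapping in MockTokenizer is its consumer-contract checks. Here is a minimal self-contained sketch of the idea (a toy model, not Lucene's MockTokenizer — the class name and exceptions are made up): the stream tracks its own state and throws if the consumer violates the workflow reset() → incrementToken() until false → end() → close(). A real tokenizer like WhitespaceTokenizer would let the same violation pass silently.

```java
// Toy model of MockTokenizer-style enableChecks: enforce the consumer workflow.
class CheckingStream {
    private final int numTokens;
    private int pos = -1;            // -1 = reset() not called yet
    private boolean exhausted = false;
    private boolean ended = false;

    CheckingStream(int numTokens) { this.numTokens = numTokens; }

    void reset() { pos = 0; exhausted = false; ended = false; }

    boolean incrementToken() {
        if (pos < 0) throw new IllegalStateException("incrementToken() before reset()");
        if (exhausted) throw new IllegalStateException("incrementToken() after it returned false");
        if (pos < numTokens) { pos++; return true; }
        exhausted = true;
        return false;
    }

    void end() {
        if (!exhausted) throw new IllegalStateException("end() before the stream was exhausted");
        ended = true;
    }

    void close() {
        if (!ended) throw new IllegalStateException("close() without end()");
    }
}
```

A well-behaved consumer (loop until incrementToken() returns false, then end(), then close()) passes; a consumer that stops pulling tokens early trips the end() check.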
4) TestLimitTokenCountAnalyzer also has a testLimitTokenCountIndexWriter that uses MockAnalyzer without calling setEnableChecks(false), which seems like it should trigger the same failure I got since it uses MockTokenizer. In general that test looks suspicious: it seems to add the exact number of tokens the limit is configured for, and then asserts that the last token is in the index, but it never actually triggers the limiting logic since exactly the allowed number of tokens is used.
Then that's fine, because when LimitTokenCountFilter consumes the whole stream, it's a "good consumer". It's only when it actually truncates that it breaks the TokenStream contract.
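The distinction can be shown with a minimal self-contained sketch (again a toy model, not Lucene's LimitTokenCountFilter — the names are made up): a wrapper that pulls at most `limit` tokens from a delegate. When the limit is at or above the delegate's token count, the wrapper drives the delegate to exhaustion and is a "good consumer"; when it truncates, the delegate is abandoned mid-stream, which is exactly what MockTokenizer's checks flag.

```java
// Toy model of why truncation breaks the consumer contract.
class LimitDemo {
    static class Delegate {
        final int numTokens;
        int pos = 0;
        boolean exhausted = false; // only set once the stream returns false
        Delegate(int n) { numTokens = n; }
        boolean incrementToken() {
            if (pos < numTokens) { pos++; return true; }
            exhausted = true;
            return false;
        }
    }

    // Pull up to `limit` tokens; report whether the delegate was fully consumed.
    static boolean consumeWithLimit(Delegate in, int limit) {
        int count = 0;
        while (count < limit && in.incrementToken()) count++;
        return in.exhausted;
    }

    public static void main(String[] args) {
        System.out.println(consumeWithLimit(new Delegate(3), 5)); // true: whole stream consumed
        System.out.println(consumeWithLimit(new Delegate(3), 2)); // false: truncated mid-stream
    }
}
```

This also matches the observation about testLimitTokenCountIndexWriter: with exactly as many tokens as the limit, the wrapper still reaches the delegate's end, so the checks never fire.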