We should probably switch to parallel arrays here to make lookups very fast... yes, this will consume RAM (8 bytes per position, if we keep all of them).
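A minimal sketch of the parallel-arrays idea (class and method names here are illustrative, not an actual Lucene API): two growable int arrays record the correction points, so mapping an offset is just a binary search plus an array read, with no per-entry object overhead, and each retained position costs exactly 8 bytes (two ints).

```java
import java.util.Arrays;

// Hypothetical sketch: parallel int arrays holding offset-correction points.
final class OffsetMap {
    private int[] offsets = new int[16]; // corrected (output) offsets where a shift starts
    private int[] diffs = new int[16];   // cumulative shift to add back from that point on
    private int size;

    // Record that from corrected offset 'off' onward, originals are shifted by 'cumulativeDiff'.
    // Corrections must be added in increasing offset order.
    void addCorrection(int off, int cumulativeDiff) {
        if (size == offsets.length) {
            offsets = Arrays.copyOf(offsets, size * 2);
            diffs = Arrays.copyOf(diffs, size * 2);
        }
        offsets[size] = off;
        diffs[size] = cumulativeDiff;
        size++;
    }

    // Map a corrected offset back to the original offset: binary search for
    // the last correction point at or before 'off', then apply its diff.
    int correct(int off) {
        int idx = Arrays.binarySearch(offsets, 0, size, off);
        if (idx < 0) idx = -idx - 2; // insertion point minus one
        return idx < 0 ? off : off + diffs[idx];
    }
}
```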
Really, most apps do not need all positions stored; typically they only need to see the current token. So maybe we could make a filter that takes a "lookbehind size" and keeps only that number of mappings cached? That would have to be greater than the max size of any token you may analyze, so it's hard to bound perfectly, but e.g. setting it to the max allowed token length in IndexWriter would guarantee that we'd never have a miss?
For analyzers that buffer tokens... they'd have to set this max to infinity, or ensure they remap the offsets before capturing the token's state?
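To make the "lookbehind" idea concrete, here is an illustrative sketch (again, hypothetical names, and it bounds the number of retained correction points rather than a character distance): older mappings are evicted, and an offset that falls before the oldest retained correction point is a "miss" and comes back uncorrected.

```java
import java.util.ArrayDeque;

// Hypothetical sketch of a bounded lookbehind cache of offset corrections.
// Offsets older than the retained window return uncorrected (a "miss").
final class BoundedOffsetMap {
    private final int lookbehind; // max number of retained correction points
    private final ArrayDeque<int[]> corrections = new ArrayDeque<>(); // {correctedOffset, cumulativeDiff}

    BoundedOffsetMap(int lookbehind) {
        this.lookbehind = lookbehind;
    }

    // Corrections must be added in increasing offset order.
    void addCorrection(int off, int cumulativeDiff) {
        corrections.addLast(new int[] { off, cumulativeDiff });
        if (corrections.size() > lookbehind) {
            corrections.removeFirst(); // evict the oldest mapping
        }
    }

    // Linear scan of the (small, bounded) window for the last correction
    // point at or before 'off'; a binary-searchable ring buffer would also work.
    int correct(int off) {
        int diff = 0;
        for (int[] c : corrections) {
            if (c[0] > off) break;
            diff = c[1];
        }
        return off + diff;
    }
}
```

This is where the buffering-analyzer caveat bites: if a token's start offset has already been evicted by the time the analyzer captures state, `correct` silently returns the wrong (uncorrected) offset, which is why the window would need to be at least as large as anything the analyzer can buffer.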