Sorry for missing your last response.
About the raw term: Solr currently shows the raw term only if the term is binary (like numerics) or if the FieldType applies some transformation (as with the deprecated Sortable* fields). I only mentioned it as an example of attributes I was missing in your example output. It is of no use for solving your problem.
I already mentioned:
One possibility to handle this might be the character offset in the original text: the request handler could use the character offsets of the begin and end of the token in the original stream instead of the token position. But this is likely to break for lots of TokenFilters (WordDelimiterFilter would work as long as you don't do stemming before...). The problem is incorrect handling of offset calculation (also leading to bugs in highlighting) when the inserted terms are longer than their originals.
This might be your only chance (using the OffsetAttribute), but it is likely to break. What you want is not possible with the analysis API of Lucene, because some information is missing. It is not needed during analysis: the absolute positions are not important for the indexer, so TokenStreams don't preserve them.
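To illustrate the offset idea, here is a minimal plain-Java sketch. It is not Lucene code: OffsetToken and OffsetMatcher are hypothetical names I made up, and the offsets are typed in by hand. In a real chain they would come from OffsetAttribute.startOffset() and endOffset(). The sketch looks up which tokenizer token encloses a later token by comparing character offsets, and returns -1 when a filter has produced offsets that no longer fit, which is the failure case described above.

```java
import java.util.List;

// Hypothetical stand-in for a token carrying Lucene-style character offsets.
final class OffsetToken {
    final String term;
    final int startOffset; // inclusive, relative to the original text
    final int endOffset;   // exclusive, relative to the original text

    OffsetToken(String term, int startOffset, int endOffset) {
        this.term = term;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

final class OffsetMatcher {
    // Returns the index of the tokenizer token whose offsets enclose the
    // given later-stage token, or -1 if no tokenizer token encloses it.
    static int originalIndex(List<OffsetToken> tokenizerOutput, OffsetToken later) {
        for (int i = 0; i < tokenizerOutput.size(); i++) {
            OffsetToken t = tokenizerOutput.get(i);
            if (later.startOffset >= t.startOffset && later.endOffset <= t.endOffset) {
                return i;
            }
        }
        return -1;
    }
}
```

For the text "the Wi-Fi router", a sub-token "Fi" with offsets 7 to 9 falls inside "Wi-Fi" (4 to 9), so it maps to the second tokenizer token. A token whose offsets span more than one original token maps to nothing.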
A possibility to preserve the original positions would be a trick in the analysis RequestHandler: it could insert a fake TokenFilter directly after the Tokenizer that adds an additional Attribute with the absolute position (incremented on each call to input.incrementToken()). This could be a hack to achieve what you want.
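A rough model of that trick, again in plain Java and not against the Lucene API: TaggedToken and PositionTagging are hypothetical names, and starting the counter at 0 is my assumption. The tag() method plays the role of the fake filter placed right after the Tokenizer, and dropStopWords() stands in for any later filter that removes tokens. The point is only that the tag assigned early survives later stages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token that carries the absolute position as an extra attribute.
final class TaggedToken {
    final String term;
    final int absolutePosition; // assigned directly after the tokenizer

    TaggedToken(String term, int absolutePosition) {
        this.term = term;
        this.absolutePosition = absolutePosition;
    }
}

final class PositionTagging {
    // Plays the role of the fake filter: numbers each token as it leaves
    // the tokenizer, counting up by one per token (starting at 0 by assumption).
    static List<TaggedToken> tag(List<String> tokenizerOutput) {
        List<TaggedToken> out = new ArrayList<>();
        int position = 0;
        for (String term : tokenizerOutput) {
            out.add(new TaggedToken(term, position++));
        }
        return out;
    }

    // Stand-in for a later filter that removes tokens; surviving tokens
    // keep the absolute position they were tagged with.
    static List<TaggedToken> dropStopWords(List<TaggedToken> in, List<String> stopWords) {
        List<TaggedToken> out = new ArrayList<>();
        for (TaggedToken t : in) {
            if (!stopWords.contains(t.term)) {
                out.add(t);
            }
        }
        return out;
    }
}
```

With the tokens "the", "quick", "fox" and "the" as a stop word, the surviving tokens "quick" and "fox" still report absolute positions 1 and 2. In a real Lucene chain this only holds as long as later filters leave the extra attribute untouched.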
Maybe I can help you with this. It needs some refactoring in the AnalysisRequestHandlers, but it might be a good idea.