Why wrap each BytesRef in a Term when in the end you only need the BytesRef? Or maybe I'm mistaken.
I haven't really optimized things yet. I'll take a look at optimizing this.
equals and hashCode are based on id, yet you initialize that to new Object(). Firstly, why not have equals/hashCode actually work? Secondly, if for some reason it should be this way, then you can do away with id and base equals on instance equality of the query instance; you don't need id.
It's designed to be equal only on identity so that it doesn't get cached. The main reason for this is that graph traversals are typically one-time jobs, so I wanted to avoid the overhead of hashCode and equals on large term lists. There may be a better approach to the identity equality, so I'll review your suggestion.
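For illustration, identity-only equality without the extra id field could look roughly like the sketch below. IdentityQuery is a hypothetical plain-Java stand-in, not the actual GraphTermsQuery, and it uses String terms instead of BytesRef to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a query that must never equal another instance.
final class IdentityQuery {
    private final List<String> terms; // may be very large

    IdentityQuery(List<String> terms) {
        this.terms = new ArrayList<>(terms);
    }

    int termCount() {
        return terms.size();
    }

    // Reference comparison only: O(1), never walks the term list.
    @Override
    public boolean equals(Object other) {
        return this == other;
    }

    // Identity hash: O(1), independent of the terms held.
    @Override
    public int hashCode() {
        return System.identityHashCode(this);
    }
}
```

Two instances built from the same terms still compare as unequal, so a cache keyed on the query would not get a hit from a second instance, and neither method touches the term list.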
I think it's very suspicious that GraphTermsQuery holds List<TermContext>; I think the Query object should not hold state pertaining to the actual index as it could cause issues with caching. Maybe you could do the construction of this in createWeight and hold it on the Weight?
This sounds like a good idea.
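A rough sketch of that split, using hypothetical plain-Java stand-ins (IndexSnapshot, SketchQuery and SketchWeight are not Lucene classes): the query keeps only the index-independent terms, and anything resolved against a particular index is built in createWeight and lives on the weight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one view of an index: term -> document frequency.
final class IndexSnapshot {
    final Map<String, Integer> docFreqs;

    IndexSnapshot(Map<String, Integer> docFreqs) {
        this.docFreqs = docFreqs;
    }
}

// Holds only what the caller asked for; nothing tied to a specific index.
final class SketchQuery {
    private final List<String> terms;

    SketchQuery(List<String> terms) {
        this.terms = new ArrayList<>(terms);
    }

    // Index-specific state is resolved here, per index, on every call.
    SketchWeight createWeight(IndexSnapshot index) {
        List<String> present = new ArrayList<>();
        for (String term : terms) {
            if (index.docFreqs.getOrDefault(term, 0) > 0) {
                present.add(term);
            }
        }
        return new SketchWeight(present);
    }
}

// Owns the per-index state (the analogue of the List<TermContext>).
final class SketchWeight {
    final List<String> resolvedTerms;

    SketchWeight(List<String> resolvedTerms) {
        this.resolvedTerms = resolvedTerms;
    }
}
```

With this shape the same query object can be reused against a different index view without carrying stale state, since each createWeight call resolves the terms afresh.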
In no place do I see you sort the incoming terms. It's faster to seek sequentially than randomly.
It appeared that the TermsQuery was sorting terms to account for different fields. But the GraphTermsQuery is always on one field. Since it's always doing a seekExact, I was assuming that it would always have to seek from the top of the terms enum anyway, because it can't make assumptions on the order of the terms. In this case it would seem sorting would just add overhead. But I could be wrong about this.
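For reference, sorting the terms up front would be cheap to try. The sketch below is plain Java with a hypothetical TermSorter helper and raw byte[] terms. It assumes the index orders terms by unsigned byte value, and it does not settle whether seekExact benefits from sorted input on a single field.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: orders raw term bytes before they are looked up.
final class TermSorter {
    // Unsigned lexicographic order; assumed here to match the index's term order.
    static final Comparator<byte[]> UNSIGNED_BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        // A shared prefix sorts before its longer extension.
        return Integer.compare(a.length, b.length);
    };

    // Returns a sorted copy and leaves the caller's list untouched.
    static List<byte[]> sorted(List<byte[]> terms) {
        List<byte[]> copy = new ArrayList<>(terms);
        copy.sort(UNSIGNED_BYTES);
        return copy;
    }
}
```

Timing the lookups with and without this step on a large term list would show whether the sort pays for itself.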