Type: New Feature
Affects Version/s: None
Fix Version/s: None
Currently, there is a significant performance difference between running an iterator such as the IntersectingIterator in the TabletServer and running it in the ClientSideIteratorScanner, even on a single node where the client runs alongside the TabletServer. Several factors contribute to this difference, including:
1. Batching between the TabletServer and Scanner makes frequent seeks inefficient.
2. The wire protocol used by the Scanner makes seeks inefficient.
3. Interprocess communication in general adds latency.
4. Encoding and decoding add latency.
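
As a rough illustration of points 1 and 2, the toy model below counts what a seek-heavy intersection costs when every seek is a round trip that ships a full batch. It is only a sketch: `SeekCostSketch`, `BatchedSource`, and the batch size are hypothetical names and values chosen for the example, and none of this is Accumulo API or a measurement of the real Scanner.

```java
import java.util.TreeSet;

class SeekCostSketch {
    // Hypothetical stand-in for a remote scanner: each seek counts as one
    // round trip and ships up to batchSize entries, even if the caller
    // only looks at the first one before seeking again.
    static final class BatchedSource {
        private final TreeSet<Integer> keys;
        private final int batchSize;
        int roundTrips = 0;
        int entriesShipped = 0;

        BatchedSource(TreeSet<Integer> keys, int batchSize) {
            this.keys = keys;
            this.batchSize = batchSize;
        }

        // Returns the smallest key >= target, or null if there is none.
        Integer seek(int target) {
            roundTrips++;
            entriesShipped += Math.min(batchSize, keys.tailSet(target, true).size());
            return keys.ceiling(target);
        }
    }

    // Seek-based intersection of two sorted sources: whichever side is
    // behind seeks forward to the other side's current key.
    static int intersect(BatchedSource a, BatchedSource b) {
        int matches = 0;
        Integer x = a.seek(0);
        Integer y = b.seek(0);
        while (x != null && y != null) {
            if (x.equals(y)) {
                matches++;
                x = a.seek(x + 1);
                y = b.seek(y + 1);
            } else if (x < y) {
                x = a.seek(y);
            } else {
                y = b.seek(x);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        TreeSet<Integer> evens = new TreeSet<>();
        TreeSet<Integer> threes = new TreeSet<>();
        for (int i = 0; i < 10_000; i++) {
            if (i % 2 == 0) evens.add(i);
            if (i % 3 == 0) threes.add(i);
        }
        BatchedSource a = new BatchedSource(evens, 1000);
        BatchedSource b = new BatchedSource(threes, 1000);
        int matches = intersect(a, b);
        System.out.println("matches=" + matches
            + " roundTrips=" + (a.roundTrips + b.roundTrips)
            + " entriesShipped=" + (a.entriesShipped + b.entriesShipped));
    }
}
```

In this model the number of entries shipped is far larger than the number of entries the intersection actually inspects, because each seek discards the rest of its batch. When the same loop runs inside the server, the seeks are local and nothing is shipped until a match is found.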
Server-side iterators will still perform better even if we fully optimize client-side iterators. However, server-side iterators come with risks to stability and security, especially if the set of iterators grows quickly. If we could mitigate some of the problems listed above, we could in theory enable more programmability of complex operations with less risk to the security and reliability of a multi-tenant Accumulo instance.
This ticket is related to the concept of running iterators in a separate process, but includes the RPC aspect as well.