Accumulo / ACCUMULO-1753

Evaluate options for optimizing a SortedKeyValueIterator-like API on the client side

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Currently, there is a significant performance difference between running an iterator such as the IntersectingIterator in the TabletServer and running it in the ClientSideIteratorScanner, even on a single node where the client runs alongside the TabletServer. Several factors contribute to this difference, including:
      1. Batching between the TabletServer and Scanner makes frequent seeks inefficient.
      2. The wire protocol used by the Scanner makes seeks inefficient.
      3. Interprocess communication in general adds latency.
      4. Encoding and decoding add latency.
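
To make the first point concrete, here is a minimal toy cost model in plain Java. It is not Accumulo code and does not use the Accumulo API: `BatchedSource`, `batchSize`, `roundTrips`, and `entriesTransferred` are hypothetical names invented for this sketch. It assumes that every seek across the process boundary discards the buffered batch and fetches a fresh one, and it intersects two sorted key lists purely by seeking, which is a deliberately seek-heavy simplification of what an intersecting iterator does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy cost model (not Accumulo code): seek-driven intersection over batched sources. */
final class SeekCostSketch {

    /** Hypothetical stand-in for a remote sorted source that answers each seek with one batch. */
    static final class BatchedSource {
        private final List<Integer> keys;              // sorted ascending, no duplicates
        private final int batchSize;                   // entries returned per round trip
        private List<Integer> buffer = Collections.emptyList();
        int roundTrips = 0;                            // simulated RPC calls
        int entriesTransferred = 0;                    // simulated entries sent over the wire

        BatchedSource(List<Integer> sortedUniqueKeys, int batchSize) {
            this.keys = new ArrayList<>(sortedUniqueKeys);
            this.batchSize = batchSize;
        }

        /** Assumption: a seek throws away the current batch and fetches a new one. */
        void seek(int target) {
            int idx = Collections.binarySearch(keys, target);
            int start = idx >= 0 ? idx : -idx - 1;     // first key >= target
            int end = Math.min(keys.size(), start + batchSize);
            buffer = keys.subList(start, end);
            roundTrips++;
            entriesTransferred += buffer.size();
        }

        boolean hasTop() { return !buffer.isEmpty(); }

        int top() { return buffer.get(0); }
    }

    /** Intersects two sources by repeatedly seeking the lagging one to the other's top key. */
    static List<Integer> intersect(BatchedSource a, BatchedSource b) {
        List<Integer> out = new ArrayList<>();
        a.seek(Integer.MIN_VALUE);
        b.seek(Integer.MIN_VALUE);
        while (a.hasTop() && b.hasTop()) {
            int x = a.top();
            int y = b.top();
            if (x == y) {
                out.add(x);
                if (x == Integer.MAX_VALUE) break;     // avoid overflow on x + 1
                a.seek(x + 1);
                b.seek(x + 1);
            } else if (x < y) {
                a.seek(y);
            } else {
                b.seek(x);
            }
        }
        return out;
    }

    /** Returns 0, step, 2*step, ... strictly below limit. */
    static List<Integer> multiples(int step, int limit) {
        List<Integer> out = new ArrayList<>();
        for (int k = 0; k < limit; k += step) out.add(k);
        return out;
    }

    public static void main(String[] args) {
        for (int batchSize : new int[] {1, 10, 100}) {
            BatchedSource a = new BatchedSource(multiples(2, 1000), batchSize);
            BatchedSource b = new BatchedSource(multiples(3, 1000), batchSize);
            List<Integer> hits = intersect(a, b);
            System.out.printf("batch=%d hits=%d roundTrips=%d transferred=%d%n",
                    batchSize, hits.size(),
                    a.roundTrips + b.roundTrips,
                    a.entriesTransferred + b.entriesTransferred);
        }
    }
}
```

In this model the number of round trips is the same for every batch size, because the seek sequence depends only on the keys. What grows with the batch size is the number of entries transferred and then discarded, which is one way to read the claim that batching makes frequent seeks inefficient. Real scanners also step through a batch with sequential reads, so the sketch overstates the waste for workloads that seek rarely.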

      Server-side iterators will still be faster even if we fully optimize client-side iterators. However, server-side iterators carry risks to stability and security, especially if the set of iterators grows quickly. If we could mitigate some of the problems above, we could in theory make complex operations more programmable with less risk to the security and reliability of a multi-tenant instance of Accumulo.

      This ticket is related to the concept of running iterators in a separate process, but includes the RPC aspect as well.

        Activity

        There are no comments yet on this issue.

          People

          • Assignee:
            Unassigned
            Reporter:
            Adam Fuchs
          • Votes:
            0
            Watchers:
            1
