  1. Accumulo
  2. ACCUMULO-1753

Evaluate options for optimizing a SortedKeyValueIterator-like API on the client side

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Currently, there is a significant performance difference between running an iterator such as the IntersectingIterator in the TabletServer and running it in the ClientSideIteratorScanner, even on a single node with the client running alongside the TabletServer. There are many reasons for this difference, including:
      1. Batching between the TabletServer and Scanner makes frequent seeks inefficient.
      2. The wire protocol used by the Scanner makes seeks inefficient.
      3. Interprocess communication in general adds latency.
      4. Encoding and decoding adds latency.
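
To make the first two points concrete, here is a rough, self-contained sketch of how seek-heavy intersection behaves when every seek is a round trip that ships a full batch. It is a toy model and not Accumulo code: `BatchedSource`, `SeekCostSketch`, the integer keys, and the batch size of 100 are all hypothetical, and the assumption that each seek discards the previous batch is a simplification of what a real scanner does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class SeekCostSketch {

    /** Hypothetical stand-in for a remote scanner: each seek is one round trip that ships a batch. */
    static final class BatchedSource {
        private final TreeSet<Integer> keys;
        private final int batchSize;
        int roundTrips = 0;
        int entriesShipped = 0;

        BatchedSource(TreeSet<Integer> keys, int batchSize) {
            this.keys = keys;
            this.batchSize = batchSize;
        }

        /** Returns the first key >= target, or null if there is none. */
        Integer seek(int target) {
            roundTrips++;
            // Assumption: the server fills a batch on every seek, even though
            // the caller below only ever looks at the first entry.
            entriesShipped += Math.min(batchSize, keys.tailSet(target, true).size());
            return keys.ceiling(target);
        }
    }

    /** Leapfrog intersection of two sorted sources (keys assumed below Integer.MAX_VALUE). */
    static List<Integer> intersect(BatchedSource a, BatchedSource b) {
        List<Integer> out = new ArrayList<>();
        Integer x = a.seek(Integer.MIN_VALUE);
        Integer y = b.seek(Integer.MIN_VALUE);
        while (x != null && y != null) {
            if (x.equals(y)) {
                out.add(x);
                int next = x + 1;
                x = a.seek(next);
                y = b.seek(next);
            } else if (x < y) {
                x = a.seek(y); // skip a forward to b's position
            } else {
                y = b.seek(x); // skip b forward to a's position
            }
        }
        return out;
    }

    /** Builds the sorted set {0, step, 2*step, ...} of values below limit. */
    static TreeSet<Integer> multiplesOf(int step, int limit) {
        TreeSet<Integer> s = new TreeSet<>();
        for (int i = 0; i < limit; i += step) {
            s.add(i);
        }
        return s;
    }

    public static void main(String[] args) {
        BatchedSource a = new BatchedSource(multiplesOf(2, 1000), 100);
        BatchedSource b = new BatchedSource(multiplesOf(3, 1000), 100);
        List<Integer> hits = intersect(a, b);
        System.out.println("matches: " + hits.size());
        System.out.println("round trips: " + (a.roundTrips + b.roundTrips));
        System.out.println("entries shipped: " + (a.entriesShipped + b.entriesShipped));
    }
}
```

In this model the number of round trips exceeds the number of matches, and most shipped entries are never read. Inside the TabletServer the same seeks would be local lookups with no batch to discard, which is the gap this ticket is about.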

      Server-side iterators will still perform better even if we completely optimize client-side iterators. However, server-side iterators come with risks to stability and security, especially if the set of iterators grows quickly. If we could mitigate some of the problems listed above, we could in theory enable more programmability of complex operations with less risk to the security and reliability of a multi-tenant Accumulo instance.

      This ticket is related to the concept of running iterators in a separate process, but includes the RPC aspect as well.

          People

            Assignee: Unassigned
            Reporter: Adam Fuchs (afuchs)
            Votes: 0
            Watchers: 2
