ACCUMULO-3978

Enable safe and efficient Scans within tserver iterators

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: tserver
    • Labels: None

    Description

      Operations such as joining two tables require scanning more than one table. A naive solution has a client scan both tables from Accumulo and write the result back to Accumulo. Performing the join inside Accumulo eliminates that needless network traffic and can improve performance significantly.

      One way to do a join inside Accumulo is to scan the first table with a scan-time iterator that opens another Scanner to the second table. The approach works well most of the time. In rare cases, when tablet servers are constrained for resources and clients aggressively use many scan threads, scans-within-scans can deadlock. See ACCUMULO-3975.
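
      The failure mode described above can be illustrated outside Accumulo with a bounded thread pool. The sketch below is plain java.util.concurrent code, not tablet server code: the class name, the pool, and the timeout are assumptions made for illustration. Each "outer scan" holds a pool thread while waiting for an "inner scan" submitted to the same pool. A timeout stands in for the indefinite wait that would occur in a real deadlock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration only; this is not how Accumulo schedules scans. */
class NestedScanStarvation {

    /**
     * Runs outerScans tasks on a pool of poolSize threads. Each task submits an
     * inner task to the same pool and waits up to waitMillis for it.
     * Returns true if at least one outer task timed out waiting for its inner task.
     */
    static boolean anyOuterScanStarves(int poolSize, int outerScans, long waitMillis)
            throws Exception {
        if (outerScans > poolSize) {
            throw new IllegalArgumentException("outerScans must not exceed poolSize");
        }
        ExecutorService scanPool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allOutersRunning = new CountDownLatch(outerScans);
        try {
            List<Future<Boolean>> outers = new ArrayList<>();
            for (int i = 0; i < outerScans; i++) {
                outers.add(scanPool.submit(() -> {
                    // Wait until every outer scan occupies a thread.
                    allOutersRunning.countDown();
                    allOutersRunning.await();
                    // The nested scan needs a thread from the same pool.
                    Future<String> inner = scanPool.submit(() -> "entry from second table");
                    try {
                        inner.get(waitMillis, TimeUnit.MILLISECONDS);
                        return false; // the inner scan found a free thread
                    } catch (TimeoutException e) {
                        return true;  // no thread was available for the inner scan
                    }
                }));
            }
            boolean starved = false;
            for (Future<Boolean> outer : outers) {
                starved |= outer.get();
            }
            return starved;
        } finally {
            scanPool.shutdownNow();
        }
    }
}
```

      With two threads and two outer scans, every thread is held by a waiting outer scan, so the inner scans cannot start. Adding a third thread lets the inner scans run, which is why the problem only shows up when scan threads are exhausted.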

      We can gain both stability (eliminating the rare but possible deadlock) and performance by supporting scans-within-scans more directly than by opening a new Scanner. For the sake of argument, suppose we add a new method to the IteratorEnvironment interface called getInstanceScanner (with a counterpart for BatchScanner), which returns a Scanner implementation connected to the Instance being scanned, with the same user and authorizations as the user who initiated the original scan.
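
      To make the proposal concrete, here is a minimal sketch of what such a method might look like. Everything in it is a stand-in: the Sketch* types are hypothetical and are not Accumulo's real Scanner, Authorizations, or IteratorEnvironment, and the real signature of getInstanceScanner is undecided. The only point is that the returned scanner inherits the user and authorizations of the original scan.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Stand-in for a key/value entry with one visibility label ("" means unlabeled). */
final class SketchEntry {
    final String row;
    final String visibility;
    final String value;

    SketchEntry(String row, String visibility, String value) {
        this.row = row;
        this.visibility = visibility;
        this.value = value;
    }
}

/** Stand-in for a scanner bound to one table. */
interface SketchScanner {
    List<SketchEntry> scan();
}

/** Stand-in for the iterator environment, extended with the proposed method. */
interface SketchIteratorEnvironment {
    SketchScanner getInstanceScanner(String tableName);
}

/** Environment created for one original scan; it remembers who started that scan. */
final class SketchScanEnvironment implements SketchIteratorEnvironment {
    private final Map<String, List<SketchEntry>> tables;
    private final String user;
    private final Set<String> authorizations;

    SketchScanEnvironment(Map<String, List<SketchEntry>> tables, String user,
                          Set<String> authorizations) {
        this.tables = tables;
        this.user = user;
        this.authorizations = new HashSet<>(authorizations);
    }

    String originatingUser() {
        return user;
    }

    @Override
    public SketchScanner getInstanceScanner(String tableName) {
        List<SketchEntry> table = tables.getOrDefault(tableName, Collections.emptyList());
        // The nested scanner sees only what the original user is authorized to see.
        return () -> {
            List<SketchEntry> visible = new ArrayList<>();
            for (SketchEntry e : table) {
                if (e.visibility.isEmpty() || authorizations.contains(e.visibility)) {
                    visible.add(e);
                }
            }
            return visible;
        };
    }
}
```

      A join iterator would then ask its environment for a scanner on the second table instead of constructing a client connection itself, so it never needs separate credentials.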

      This new Scanner can take advantage of the fact that it is already inside a tablet server. If it is used to scan a tablet hosted by the same tablet server, it can open the appropriate RFiles and in-memory maps and read them by direct method calls, rather than going through the Thrift serialize-transmit-deserialize path from one port of the tablet server to another. If it is used to scan a tablet hosted by another tablet server, it can use Thrift as usual.
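
      The local-versus-remote decision could be sketched roughly as follows. The class, the enum, and the location map are hypothetical names introduced for illustration; how a tablet server would actually look up tablet locations is not specified in this issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that picks a read path for a nested scan. */
final class TabletReadRouter {
    enum ReadPath { DIRECT_METHOD_CALL, THRIFT_RPC }

    private final String localServer;
    private final Map<String, String> tabletToServer; // tablet id -> hosting server

    TabletReadRouter(String localServer, Map<String, String> tabletToServer) {
        this.localServer = localServer;
        this.tabletToServer = new HashMap<>(tabletToServer);
    }

    /** Tablets hosted locally are read in-process; all others go through Thrift. */
    ReadPath choosePath(String tabletId) {
        String hostingServer = tabletToServer.get(tabletId);
        if (hostingServer == null) {
            throw new IllegalArgumentException("Unknown tablet: " + tabletId);
        }
        return hostingServer.equals(localServer)
                ? ReadPath.DIRECT_METHOD_CALL
                : ReadPath.THRIFT_RPC;
    }
}
```

      Only the direct path avoids the serialization cost; the remote path behaves like an ordinary Scanner.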

      The above scheme always avoids deadlock as long as users restrict themselves to scans-within-scans. To prevent scans-within-scans-within-scans from deadlocking, we could store the original scan ID with scans, or spawn a new thread in a tablet server whenever it goes into a waiting state as the result of opening a new Scanner and waiting to receive entries. Some re-entrant mechanism might also do the trick. Alternatively, we could simply not support deeper nesting.
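
      Two of these options, storing the original scan ID and refusing deeper nesting, can be combined in a small context object. The sketch below is an assumption about how that bookkeeping might look; NestedScanContext and its depth limit are hypothetical names, not existing Accumulo classes.

```java
/** Hypothetical per-scan context that tracks the originating scan and nesting depth. */
final class NestedScanContext {
    /** Depth 0 is the original scan; depth 1 is a scan-within-a-scan. */
    static final int MAX_NESTING_DEPTH = 1;

    final long originalScanId;
    final int depth;

    private NestedScanContext(long originalScanId, int depth) {
        this.originalScanId = originalScanId;
        this.depth = depth;
    }

    /** Context for a scan started directly by a client. */
    static NestedScanContext topLevel(long scanId) {
        return new NestedScanContext(scanId, 0);
    }

    /** Context for a scan opened from inside this scan; keeps the original scan ID. */
    NestedScanContext nested() {
        if (depth >= MAX_NESTING_DEPTH) {
            throw new UnsupportedOperationException(
                "Scan " + originalScanId + " is already nested " + depth + " level(s) deep");
        }
        return new NestedScanContext(originalScanId, depth + 1);
    }
}
```

      Because every nested scan carries the ID of the scan that started the chain, a tablet server could recognize work belonging to a scan it is already serving, or reject a third level outright as shown here.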

      Further optimizations and ideas are possible. For example, we could use scans from one tablet server to another as an opportunity to replicate data between tablet servers. Let's use good design to choose a simple and reliable idea.

      Note that if the scan-inside-a-scan targets the same table as the original scan, an SKVI (SortedKeyValueIterator) deepCopy may solve the problem. There may be some mechanisms in deepCopy that we can reuse or adapt.

            People

              Assignee: Unassigned
              Reporter: Shana Hutchison (shutchis)
              Votes: 0
              Watchers: 3
