  1. Kudu
  2. KUDU-2570

Initialize rowset iterators lazily

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.7.1
    • Fix Version/s: None
    • Component/s: tablet
    • Labels: None

    Description

      When a scan initializes a tablet iterator, the tablet iterator initializes an iterator for each rowset that is deemed relevant. When there are many rowsets (usually because of a missing feature such as KUDU-1400, or a pathological partition schema or configuration), this can take a long time, leading to scan timeouts like

      WARNINGS: Unable to open scanner: Timed out: Scan RPC to 10.1.11.187:7050 timed out after 169.988s (SENT)

      For non-fault-tolerant scans, it seems like we should be able to init each rowset iterator when we first go to retrieve rows from it, and thereby amortize all the seeks needed to open rowsets across many ScanRequest RPC round trips.
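
      The idea can be illustrated with a minimal, self-contained sketch. It is not Kudu's actual iterator code: RowSetIter, LazyUnionIter, NextBatch, and NumInitialized are hypothetical names invented for this illustration, rows are modeled as strings, and Init() stands in for the expensive seeks. Each call to NextBatch plays the role of one ScanRequest round trip, and a rowset iterator is initialized only when the scan first reads from it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-rowset iterator (not Kudu's real class).
class RowSetIter {
 public:
  explicit RowSetIter(std::vector<std::string> rows) : rows_(std::move(rows)) {}

  // Models the expensive open step (seeks, block reads) that should be deferred.
  void Init() { initialized_ = true; }
  bool initialized() const { return initialized_; }

  bool HasNext() const {
    assert(initialized_);  // Reading before Init() is a programming error.
    return pos_ < rows_.size();
  }
  const std::string& Next() {
    assert(initialized_);
    return rows_[pos_++];
  }

 private:
  std::vector<std::string> rows_;
  std::size_t pos_ = 0;
  bool initialized_ = false;
};

// Concatenates rowset iterators in order, initializing each on first use.
// Unordered concatenation is an assumption that suits non-fault-tolerant scans.
class LazyUnionIter {
 public:
  explicit LazyUnionIter(std::vector<RowSetIter> iters)
      : iters_(std::move(iters)) {}

  // Fills `out` with up to `max_rows` rows, modeling one scan round trip.
  // Returns false once every rowset is exhausted.
  bool NextBatch(std::size_t max_rows, std::vector<std::string>* out) {
    out->clear();
    while (cur_ < iters_.size() && out->size() < max_rows) {
      RowSetIter& it = iters_[cur_];
      if (!it.initialized()) it.Init();  // Cost is paid here, not at scan open.
      if (it.HasNext()) {
        out->push_back(it.Next());
      } else {
        ++cur_;  // Current rowset is drained; move to the next one.
      }
    }
    return !out->empty();
  }

  // Number of rowset iterators initialized so far.
  std::size_t NumInitialized() const {
    std::size_t n = 0;
    for (const RowSetIter& it : iters_) {
      if (it.initialized()) ++n;
    }
    return n;
  }

 private:
  std::vector<RowSetIter> iters_;
  std::size_t cur_ = 0;
};
```

      In this sketch, constructing a LazyUnionIter over any number of rowsets performs no Init() calls, so opening the scanner stays cheap; the initialization cost is spread across the later NextBatch calls instead.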

      For fault-tolerant scans, things are more complicated, but it should still be possible to be lazier about initializing rowset iterators.

          People

            Assignee: Unassigned
            Reporter: William Berkeley (wdberkeley)