Kudu / KUDU-1802

Deserializing scan tokens should avoid round-trip to master

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.0
    • Fix Version/s: None
    • Component/s: client, perf
    • Labels: None

      Description

      Currently, KuduScanToken::DeserializeIntoScanner calls KuduClient::OpenTable(), which issues a GetTableSchema RPC to the master. This round trip is expensive in practice because it produces a "thundering herd" for an Impala query or Spark job: every host deserializes a batch of scan tokens at the same time, the master receives a burst of requests, and the clients end up having to back off.

      We should consider some ways to avoid this.
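
One possible direction, sketched below purely for illustration, is to memoize the schema lookup inside each client process so that deserializing many tokens for the same table costs one fetch instead of one per token. This is not Kudu code: `TableSchema`, `SchemaCache`, and the `Fetcher` callback are hypothetical stand-ins, and the callback simulates the GetTableSchema round trip. It also only reduces the number of requests per host; it does not remove the first request from each host.

```cpp
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the result of a GetTableSchema call.
struct TableSchema {
  std::string table_name;
  int num_columns;
};

// Hypothetical per-process cache: at most one fetch per table name.
class SchemaCache {
 public:
  // The fetcher simulates the round trip to the master.
  using Fetcher = std::function<TableSchema(const std::string&)>;

  explicit SchemaCache(Fetcher fetch) : fetch_(std::move(fetch)) {}

  std::shared_ptr<const TableSchema> Get(const std::string& table_name) {
    // The lock is held across the fetch, so concurrent callers wait for
    // the first fetch instead of issuing their own.
    std::lock_guard<std::mutex> lock(mu_);
    auto it = cache_.find(table_name);
    if (it != cache_.end()) {
      return it->second;  // Cache hit: no round trip.
    }
    // Cache miss: the only place the (simulated) master is contacted.
    auto schema = std::make_shared<const TableSchema>(fetch_(table_name));
    cache_.emplace(table_name, schema);
    return schema;
  }

 private:
  Fetcher fetch_;
  std::mutex mu_;
  std::unordered_map<std::string, std::shared_ptr<const TableSchema>> cache_;
};
```

A caller would construct one `SchemaCache` per process and call `Get()` with the table name once per token. A real design would also need invalidation when the table is altered, which this sketch leaves out.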

            People

            • Assignee: Unassigned
            • Reporter: Todd Lipcon (tlipcon)
            • Votes: 0
            • Watchers: 1
