[KUDU-1259] C++ Client scanner API uses lots of RAM for empty projections - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: Public beta
Fix Version/s: 0.7.0
Component/s: client, impala
Labels: None
Target Version/s: 0.7.0
Code Review: http://gerrit.cloudera.org:8080/#/c/1562/

Description

Currently, the server side chooses the number of rows per batch so that each response reaches a size budget of a few MB or a time budget, whichever comes first. But when scanning an empty projection (for a COUNT query in Impala, for example), the size budget is never reached, so we can end up with response batches of upwards of 80M rows. This is fine at the RPC layer, but when the client expands such a batch into a vector<KuduRowResult>, we end up taking sizeof(KuduRowResult) * 80M = 1.3GB of RAM.
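The arithmetic above can be checked with a small sketch. `FakeRowResult` is a hypothetical stand-in for KuduRowResult, assumed here to hold two pointers (16 bytes on a typical 64-bit build), which is consistent with the 1.3GB figure but is not taken from the real class definition. `VectorPayloadBytes` is likewise an illustrative helper, not part of the Kudu client.

```cpp
#include <cstdint>

// Hypothetical stand-in for KuduRowResult. The two-pointer layout is an
// assumption chosen to match the 16 bytes per row implied by the ticket.
struct FakeRowResult {
  const void* schema;
  const std::uint8_t* row_data;
};

// Bytes occupied by the elements of a vector<RowT> holding num_rows entries
// (ignores allocator overhead and any spare capacity).
template <typename RowT>
constexpr std::uint64_t VectorPayloadBytes(std::uint64_t num_rows) {
  return num_rows * sizeof(RowT);
}

// Batch size reported in the ticket for an empty projection.
constexpr std::uint64_t kRowsInBatch = 80000000ULL;
```

On a 64-bit build, `VectorPayloadBytes<FakeRowResult>(kRowsInBatch)` evaluates to 1,280,000,000 bytes, roughly the 1.3GB quoted above, even though the rows themselves carry no column data.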

When running COUNT queries in Impala against tables with lots of tablets, this quickly pushes the impalad up to 60GB+ of RAM usage, since it starts a number of scanners in parallel.
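For illustration only, a COUNT over an empty projection needs just the number of rows in each batch, not one object per row. The sketch below assumes a hypothetical `FakeBatch` type that exposes a row count; it is not the Kudu client API and is not necessarily how the linked code review resolved the issue.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one scan response: only the row count.
struct FakeBatch {
  std::uint64_t num_rows;
};

// Sums row counts across batches without creating any per-row objects,
// so memory use does not grow with the number of rows in a batch.
std::uint64_t CountRows(const std::vector<FakeBatch>& batches) {
  std::uint64_t total = 0;
  for (const FakeBatch& batch : batches) {
    total += batch.num_rows;
  }
  return total;
}
```

With this shape, an 80M-row batch costs the same client memory as an empty one, so running many scanners in parallel no longer multiplies a per-row allocation.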

People

Assignee: Todd Lipcon

Reporter: Todd Lipcon

Votes: 0

Watchers: 1

Dates

Created: 10/Nov/15 23:17

Updated: 25/Jan/16 21:32

Resolved: 25/Jan/16 21:32