[DRILL-6161] Allocate memory for outgoing vectors based on sizing calculations - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.12.0
Fix Version/s: 1.14.0
Component/s: Execution - Flow
Labels:
- ready-to-commit

Description

Currently, Drill allocates memory for outgoing value vectors in one of two ways: either up front for the maximum of 64K entries, or starting at 4096 entries and doubling as more memory is needed. Every time a vector doubles, we allocate a new vector, copy the existing data into it, and zero-fill the new half. This carries a performance penalty. As part of the batch sizing project, we limit the number of rows in the outgoing batch based on memory, using the sizing information of the incoming batch(es). Since we know the number of rows and the average size of each column in the outgoing batch, we should use that information to preallocate memory for the outgoing vectors. This will be done as each operator is changed to produce batches of the configured size.
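
To make the cost difference concrete, here is a minimal sketch, not Drill's actual allocator code. The class and method names are hypothetical, and it assumes a single column whose size can be estimated as row count times average entry width. It counts how many reallocations the start-at-4096-and-double strategy would need for a given row count, and computes the size a single up-front allocation would request instead.

```java
// Illustrative sketch only: these names are hypothetical and are not part of Drill.
final class VectorPreallocSketch {
    // Starting capacity of the doubling strategy described in this issue.
    static final int INITIAL_ENTRIES = 4096;

    // Number of times a vector that starts at 4096 entries must double
    // before it can hold rowCount entries. Each doubling implies a new
    // allocation, a copy of the old data, and a zero-fill of the new half.
    public static int doublingsNeeded(int rowCount) {
        long capacity = INITIAL_ENTRIES; // long avoids overflow for huge row counts
        int doublings = 0;
        while (capacity < rowCount) {
            capacity *= 2;
            doublings++;
        }
        return doublings;
    }

    // Bytes to request once, up front, for one column, given the row count
    // and the average entry size taken from batch sizing information.
    public static long preallocBytes(int rowCount, double avgEntryBytes) {
        return (long) Math.ceil(rowCount * avgEntryBytes);
    }

    public static void main(String[] args) {
        int rowCount = 50_000;        // assumed outgoing row count
        double avgEntryBytes = 12.5;  // assumed average column width
        System.out.println("doublings avoided: " + doublingsNeeded(rowCount));
        System.out.println("bytes to preallocate: " + preallocBytes(rowCount, avgEntryBytes));
    }
}
```

With the assumed figures, the doubling strategy would reallocate and copy four times (4096 to 65536 entries), whereas the sizing-based approach issues one allocation request of 625,000 bytes for the column.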

Another improvement is to pack the value vectors as densely as possible to improve overall memory utilization. Since we allocate memory in powers of 2, once we figure out the number of rows to include in the outgoing batch, we round it down to the closest power of 2 and allocate memory for that many rows.
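
The rounding step can be sketched as follows. This is an illustration under stated assumptions rather than the code committed for this issue: the class and method names are hypothetical, and the density argument is exact only for fixed-width entries whose width is itself a power of 2. `Integer.highestOneBit` is a standard JDK method that returns the largest power of 2 not exceeding a positive `int`.

```java
// Illustrative sketch only: these names are hypothetical and are not part of Drill.
final class PowerOfTwoRowsSketch {
    // Largest power of 2 that is <= rowCount.
    public static int roundDownToPowerOfTwo(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be positive");
        }
        return Integer.highestOneBit(rowCount);
    }

    // Smallest power of 2 that is >= rowCount: the capacity a
    // power-of-2 allocator would hand back for that many entries.
    public static long roundUpToPowerOfTwo(int rowCount) {
        long down = roundDownToPowerOfTwo(rowCount);
        return down == rowCount ? down : down << 1;
    }

    public static void main(String[] args) {
        int computedRows = 50_000; // assumed row count from the memory limit

        // Without rounding, 50,000 entries occupy a 65,536-entry allocation.
        long allocated = roundUpToPowerOfTwo(computedRows);
        System.out.println("unrounded fill ratio: " + (double) computedRows / allocated);

        // Rounding the row count down makes the rows fill the allocation exactly.
        int packedRows = roundDownToPowerOfTwo(computedRows);
        System.out.println("rows per batch after rounding: " + packedRows);
    }
}
```

For the assumed 50,000 rows, the unrounded batch would fill only about 76% of a 65,536-entry allocation, while a batch of 32,768 rows fills its allocation completely. The trade-off is fewer rows per batch.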

Issue Links

relates to: DRILL-6238 Batch sizing for operators (Open)

People

Assignee: Padma Penumarthy

Reporter: Padma Penumarthy

Reviewer: Paul Rogers

Votes: 0

Watchers: 2

Dates

Created: 15/Feb/18 02:59

Updated: 12/Jun/18 22:07

Resolved: 12/Jun/18 22:07