[IMPALA-3066] Break cross-compiler IR for built-in functions into multiple modules - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Won't Fix
Affects Version/s: Impala 2.5.0
Fix Version/s: None
Component/s: Backend
Labels:
- codegen

Target Version: Product Backlog

Description

We currently include all built-in functions and cross-compiled IR in a single LLVM bitcode module that is parsed and loaded into memory for every query. We eliminate dead code early in the optimisation process, but we still pay the cost of building and walking the in-memory IR for functions that the query does not use. About 100ms of CodeGen time is PrepareTime for the module.

A lot of the module is infrequently used functions:

tarmstrong@tarmstrong-box:~/Impala/Impala$ llvm-dis llvm-ir/impala-sse.bc
tarmstrong@tarmstrong-box:~/Impala/Impala$ grep 'Reservoir' llvm-ir/impala-sse.ll | wc -l
3705
tarmstrong@tarmstrong-box:~/Impala/Impala$ grep 'Timestamp' llvm-ir/impala-sse.ll | wc -l
3015
tarmstrong@tarmstrong-box:~/Impala/Impala$ grep 'boost' llvm-ir/impala-sse.ll | wc -l
13240

We already have a mechanism for loading LLVM bitcode on demand for UDFs when they are referenced by a query. We should split out built-in functions into multiple LLVM modules and only load them when required for the query.
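
The proposal can be illustrated with a minimal sketch in Python. It is not Impala's implementation. The function-to-module grouping, the module names (`core`, `timestamp`, `reservoir`), and the `load_module` callback are all hypothetical, chosen only to show how a query might pull in just the modules that hold the built-ins it references.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical grouping of built-ins into separate bitcode modules.
# The function and module names are illustrative, not Impala's real layout.
FUNCTION_TO_MODULE: Dict[str, str] = {
    "add": "core",
    "to_timestamp": "timestamp",
    "reservoir_sample": "reservoir",
}


def modules_needed(
    referenced: Iterable[str],
    function_to_module: Dict[str, str],
    always: Iterable[str] = ("core",),
) -> List[str]:
    """Return the sorted module names required by the referenced functions.

    Modules listed in `always` are included for every query.
    """
    needed = set(always)
    for fn in referenced:
        needed.add(function_to_module[fn])
    return sorted(needed)


def load_modules_for_query(
    referenced: Iterable[str],
    function_to_module: Dict[str, str],
    load_module: Callable[[str], object],
) -> Dict[str, object]:
    """Load only the modules this query needs.

    `load_module` stands in for whatever parses a bitcode file; it is an
    assumption of this sketch, not an existing API.
    """
    return {
        name: load_module(name)
        for name in modules_needed(referenced, function_to_module)
    }
```

Under this grouping, a query that only references `add` would load the `core` module alone, and the IR for the timestamp and reservoir-sampling functions would never be parsed.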

People

Assignee: Tim Armstrong

Reporter: Tim Armstrong

Votes: 0

Watchers: 3

Dates

Created: 23/Feb/16 22:10

Updated: 17/Oct/16 21:03

Resolved: 17/Oct/16 21:03