Details
- Type: Task
- Status: Resolved
- Priority: Minor
- Resolution: Won't Do
- Impala 2.2
- None
Description
While I was investigating the performance of "select * from tpch.lineitem", Thrift seemed to be very slow. I ran a benchmark comparing Thrift and Cap'n Proto, transferring a total of 1 GB in 1 MB responses over the loopback. Cap'n Proto took ~0.5 seconds, whereas Thrift took ~6 seconds. Thrift was set up similarly to how it's used in Impala. Todd was able to get the Thrift timing down to ~1.7 seconds with a few simple tweaks that aren't used by Impala. The improvements are:
1) Generate code using templates. Without this, Thrift generates inheritance-style code, which results in a virtual call to read or write every data point (such as an int).
2) Use a framed transport. The problem was that even with the buffered transport, the default buffer was too small (Impala also uses the default buffer size). It's possible that all we need to do is increase the buffer size. Testing should be done on a real cluster, since the loopback could give very different results.
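To illustrate point 2, here is a minimal sketch of why buffer size matters when a response is serialized as many small fields. It does not use Thrift at all: `CountingSink`, `send_buffered`, and `send_framed` are hypothetical names, the sink is a stand-in for a socket, and the 512-byte and 2 MiB buffer sizes are illustrative assumptions rather than measured Impala settings. It only counts how many writes reach the underlying sink, which roughly corresponds to the number of system calls a real transport would make.

```python
import io
import struct


class CountingSink(io.RawIOBase):
    """Hypothetical stand-in for a socket: counts write calls and bytes."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.bytes_written = 0

    def writable(self):
        return True

    def write(self, data):
        # Pretend every write succeeds in full, like a fast loopback socket.
        self.calls += 1
        self.bytes_written += len(data)
        return len(data)


def send_buffered(fields, buffer_size):
    """Write each field through a buffer of the given size, then flush."""
    sink = CountingSink()
    writer = io.BufferedWriter(sink, buffer_size=buffer_size)
    for field in fields:
        writer.write(field)
    writer.flush()
    writer.detach()  # keep the sink open after the writer goes away
    return sink


def send_framed(fields):
    """Assemble the whole message, then send it as one length-prefixed frame."""
    sink = CountingSink()
    body = b"".join(fields)
    sink.write(struct.pack("!i", len(body)) + body)
    return sink


if __name__ == "__main__":
    # One 1 MiB response made of 4-byte integers.
    fields = [struct.pack("!i", i) for i in range(262144)]
    small = send_buffered(fields, 512)
    large = send_buffered(fields, 2 * 1024 * 1024)
    framed = send_framed(fields)
    print("small buffer writes:", small.calls)
    print("large buffer writes:", large.calls)
    print("framed writes:", framed.calls)
```

With the small buffer, the sink sees many writes per response; with a buffer larger than the response, or with a single frame, it sees far fewer. This is consistent with the guess above that increasing the buffer size alone might recover much of the gain, but it says nothing about real network behavior.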
Issue Links
- is duplicated by: IMPALA-9382 Prototype denser runtime profile implementation (Resolved)
- is related to: IMPALA-4568 Cache Parquet footer cache to speedup scans & predicate evaluation against Min/Max indexes (Open)