Details
Description
Our current implementation of decimal multiplication can be slow and non-optimal due to having branches in our code.
zamsden suggested to use https://en.wikipedia.org/wiki/Karatsuba_algorithm multiplication for int128 * int128 -> int256 multiply. The following example implements this and uses 3 hardware 64-bit multiplies to get a full 256 bit result. The code is written in inline assembly and has no branches.
http://coliru.stacked-crooked.com/a/25a697389211189f
We should consider benchmarking this code and using this approach if it turns out to be faster.
Attachments
Issue Links
- is related to
-
IMPALA-6357 Use faster carries in decimal arithmetic
- Open