Mostafa Mokhtar profiled COMPUTE STATS TABLESAMPLE with perf and found that a lot of time is spent finalizing HLL intermediates, most of it in powf().
Relevant snippet from AggregateFunctions::HllFinalEstimate() in aggregate-functions-ir.cc:
Since we're computing a power of 2, using ldexp() should be much more efficient.
I did a microbenchmark and found that ldexp() is >10x faster than powf() for this scenario.
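To illustrate the substitution, here is a minimal sketch. It is not the actual Impala code: the function names and the register array are hypothetical, and it only assumes that the finalize step sums 2^-r over small non-negative integer register values. ldexp(1.0f, -r) computes 1.0 * 2^-r by adjusting the exponent directly, so it should give the same value as powf(2.0f, -r) without going through the general-purpose power routine.

```cpp
#include <math.h>
#include <stdint.h>

// Hypothetical stand-in for the existing approach: sum 2^-r with powf().
float SumPowersWithPowf(const uint8_t* registers, int num_registers) {
  float sum = 0.0f;
  for (int i = 0; i < num_registers; ++i) {
    sum += powf(2.0f, -1.0f * registers[i]);
  }
  return sum;
}

// Proposed approach: ldexp(x, e) returns x * 2^e, so ldexp(1.0f, -r) is 2^-r.
float SumPowersWithLdexp(const uint8_t* registers, int num_registers) {
  float sum = 0.0f;
  for (int i = 0; i < num_registers; ++i) {
    sum += ldexp(1.0f, -static_cast<int>(registers[i]));
  }
  return sum;
}
```

Both functions should return the same sum for any register contents in this range, so the change should only affect speed, not the estimate.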