For float_type, lexical_cast was replaced by snprintf (see https://issues.apache.org/jira/browse/IMPALA-1738). Why not make the same replacement for num_type?
The test was run with two SQL cases:
1) group by an int cast to string:
select cast(f1 as string) as kk, count ( * ) from test.my_table group by kk;
2) group by int:
select f1 as kk, count( * ) from test.my_table group by kk;
The benchmark shows that with the current lexical_cast conversion, performance degraded severely as the number of threads increased.
With snprintf, performance improved significantly as the number of threads increased.
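
For illustration, the proposed replacement for integer types could look roughly like the sketch below. The helper name IntToStringSnprintf is hypothetical and is not taken from the Impala source; the sketch only assumes that the value to convert fits in a 64-bit signed integer.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the Impala code base): converts an integer to
// its decimal string form with snprintf instead of boost::lexical_cast.
std::string IntToStringSnprintf(int64_t value) {
  // The longest int64_t value is "-9223372036854775808": 20 characters plus
  // the terminating NUL, so 24 bytes is enough.
  char buf[24];
  int len = std::snprintf(buf, sizeof(buf), "%lld",
                          static_cast<long long>(value));
  // snprintf returns a negative value on error, or the length it would have
  // written; treat both failure and truncation as an empty result.
  if (len < 0 || len >= static_cast<int>(sizeof(buf))) return std::string();
  return std::string(buf, static_cast<size_t>(len));
}
```

Calling IntToStringSnprintf(f1) for each row would then produce the same decimal text that cast(f1 as string) is expected to return in case 1. Whether this matches the existing output format for every num_type would still need to be checked.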