Thanks to the new kernel framework, a SIMD version of a kernel can now override the default version at runtime.
Below are the things that need to be implemented for a SIMD version of the sum kernel:
1. Add a SIMD path for the dense aggregate sum.
2. Add build support to append the compiler flag for the SIMD source file.
3. Register the SIMD version at runtime according to the CPU features available.
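
The three steps above can be sketched roughly as follows. This is a minimal, standalone illustration and not the framework's actual API: `CpuLevel`, `SumKernelRegistry`, `SumDenseScalar`, `SumDenseAvx2`, `DetectCpuLevel` and `MakeRegistry` are all hypothetical names, and AVX2 is assumed as the SIMD target. The per-function `target("avx2")` attribute stands in for step 2; in a real build the SIMD code would more likely live in its own source file compiled with a flag such as `-mavx2`. The CPU check relies on the GCC/Clang builtin `__builtin_cpu_supports` and is compiled only on x86-64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define SKETCH_HAVE_AVX2_PATH 1
#endif

// All names below are hypothetical; they only illustrate the idea.
enum class CpuLevel { kNone = 0, kAvx2 = 1 };

using SumFn = int64_t (*)(const int32_t* values, size_t length);

// Default (scalar) sum over a dense array with no nulls.
int64_t SumDenseScalar(const int32_t* values, size_t length) {
  int64_t total = 0;
  for (size_t i = 0; i < length; ++i) total += values[i];
  return total;
}

#ifdef SKETCH_HAVE_AVX2_PATH
// Step 1: SIMD path. Step 2 (assumption): the target attribute replaces
// a per-file compiler flag so the sketch builds as a single file.
__attribute__((target("avx2")))
int64_t SumDenseAvx2(const int32_t* values, size_t length) {
  __m256i acc = _mm256_setzero_si256();  // four 64-bit partial sums
  size_t i = 0;
  for (; i + 4 <= length; i += 4) {
    __m128i v32 = _mm_loadu_si128(reinterpret_cast<const __m128i*>(values + i));
    acc = _mm256_add_epi64(acc, _mm256_cvtepi32_epi64(v32));  // widen, then add
  }
  alignas(32) int64_t lanes[4];
  _mm256_store_si256(reinterpret_cast<__m256i*>(lanes), acc);
  int64_t total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
  for (; i < length; ++i) total += values[i];  // scalar tail
  return total;
}
#endif

struct KernelEntry {
  CpuLevel level;
  SumFn fn;
};

class SumKernelRegistry {
 public:
  void Add(CpuLevel level, SumFn fn) { entries_.push_back({level, fn}); }

  // Returns the kernel with the highest level the CPU supports.
  SumFn Dispatch(CpuLevel supported) const {
    const KernelEntry* best = nullptr;
    for (const auto& e : entries_) {
      if (e.level <= supported && (best == nullptr || e.level > best->level)) {
        best = &e;
      }
    }
    assert(best != nullptr && "a scalar fallback must be registered");
    return best->fn;
  }

 private:
  std::vector<KernelEntry> entries_;
};

// Queries the CPU at runtime; falls back to kNone on other platforms.
CpuLevel DetectCpuLevel() {
#ifdef SKETCH_HAVE_AVX2_PATH
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2")) return CpuLevel::kAvx2;
#endif
  return CpuLevel::kNone;
}

// Step 3: register each version together with the CPU level it needs.
SumKernelRegistry MakeRegistry() {
  SumKernelRegistry registry;
  registry.Add(CpuLevel::kNone, SumDenseScalar);
#ifdef SKETCH_HAVE_AVX2_PATH
  registry.Add(CpuLevel::kAvx2, SumDenseAvx2);
#endif
  return registry;
}
```

A caller would then fetch the kernel once with `MakeRegistry().Dispatch(DetectCpuLevel())` and invoke the returned function pointer. On a machine without AVX2 the same call returns the scalar version, so results are identical either way and only throughput differs.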