Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
The test below shows that the summation kernel is less precise than numpy.sum.
NumPy implements pairwise summation [1], whose worst-case round-off error grows as O(log n), compared with O(n) for naive left-to-right summation.
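To make the difference concrete, here is a minimal pure-Python sketch of the two strategies. It is illustrative only: the function names and the `block` cutoff are assumptions made for this example, and it is neither NumPy's nor Arrow's actual implementation, which differ in details.

```python
import math


def naive_sum(values):
    """Left-to-right accumulation; round-off can grow linearly with len(values)."""
    total = 0.0
    for v in values:
        total += v
    return total


def pairwise_sum(values, lo=0, hi=None, block=8):
    """Split the range in half recursively and add the two partial sums.

    `block` is a hypothetical cutoff below which we fall back to the naive loop.
    """
    if hi is None:
        hi = len(values)
    if hi - lo <= block:
        return naive_sum(values[lo:hi])
    mid = lo + (hi - lo) // 2
    return pairwise_sum(values, lo, mid, block) + pairwise_sum(values, mid, hi, block)


data = [0.1] * 1_000_000
reference = math.fsum(data)  # correctly rounded sum, used as ground truth

naive_err = abs(naive_sum(data) - reference)
pairwise_err = abs(pairwise_sum(data) - reference)
print("naive error:   ", naive_err)
print("pairwise error:", pairwise_err)
```

Because each input passes through only about log2(n) additions in the pairwise version, its accumulated error stays far smaller than that of the naive loop on this input.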
sum.py
import numpy as np
import pyarrow.compute as pc

t = np.arange(321000, dtype='float64')
t2 = t - np.mean(t)
t2 *= t2
print('numpy sum:', np.sum(t2))
print('arrow sum:', pc.sum(t2))
test result
# Verified with Wolfram Alpha (arbitrary precision); NumPy's result is correct.
$ ARROW_USER_SIMD_LEVEL=SSE4_2 python sum.py
numpy sum: 2756346749973250.0
arrow sum: 2756346749973248.0
$ ARROW_USER_SIMD_LEVEL=AVX2 python sum.py
numpy sum: 2756346749973250.0
arrow sum: 2756346749973249.0
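The reference value can also be checked without an external tool. The sketch below is a pure-Python stand-in for the script above, not part of the original report. It relies on two observations: every squared deviation here is exactly representable in float64, and the sum has the closed form n(n² - 1)/12. The variable names are chosen for this example.

```python
import math

n = 321000
mean = sum(range(n)) / n  # 160499.5, exactly representable in float64

# Each deviation is a multiple of 0.5, so d * d is exact in float64.
t2 = []
for k in range(n):
    d = k - mean
    t2.append(d * d)

# Exact reference in integer arithmetic: (k - mean)^2 == (2k - (n - 1))^2 / 4
exact_times_4 = sum((2 * k - (n - 1)) ** 2 for k in range(n))
exact = exact_times_4 // 4

print("exact integer sum:", exact)                   # 2756346749973250
print("closed form      :", n * (n * n - 1) // 12)   # 2756346749973250
print("math.fsum        :", math.fsum(t2))           # 2756346749973250.0
```

Since the inputs are exact, any deviation from 2756346749973250.0 in a floating-point sum of `t2` comes purely from accumulated round-off in the summation itself.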
Issue Links
- relates to ARROW-11567 [C++][Compute] Variance kernel has precision issue (Resolved)
- links to