Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
Description
Since in the relatively near future, one will be able to do non-trivial analytical operations and query processing natively on Arrow data structures through pyarrow, it does not make sense to require users to always install NumPy when they install pyarrow. I propose to split the NumPy-depending parts of libarrow_python into a libarrow_numpy (which also must be bundled) and moving this part of the codebase into a separate Cython module.
This refactoring should be relatively painless though there may be a number of packaging details to chase up since this would introduce a new shared library to be installed in various packaging targets.
Attachments
Issue Links
- is related to
-
ARROW-16820 C++ project doesn't need to require Numpy with ARROW_PYTHON=ON because it only builds Python extension's support libraries
- Open