Details
Type: New Feature
Status: Open
Priority: Major
Resolution: Unresolved
Description
I've been thinking it would be useful to have a minimal Cython package, call it "cyarrow", containing some pxd files and a small amount of compiled pyx code that needs only a C compiler to build. It would enable projects written in Cython to interact with Arrow datasets in minimal ways (for example, iterating over their values or working with dictionary-encoded/categorical arrays) that don't amount to reimplementing the "hard stuff", for which they would still want to use pyarrow or the C++ library. Otherwise, every Python project that has compiled Cython code and wants to use the C interface (https://github.com/apache/arrow/blob/master/docs/source/format/CDataInterface.rst) would have to create its own minimal implementation.
The target users for this project would be Python projects like scikit-learn that are mostly written in Cython.
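
To make the "iterating over their values" use case concrete, here is a rough sketch of what consuming an array through the C interface involves. It uses plain Python with ctypes rather than Cython so that it runs without a build step. The ArrowArray field layout follows the C Data Interface document linked above; the helper name iter_int32 and the hand-built sample array are assumptions made for illustration only and are not part of pyarrow or of any proposed cyarrow API. A real consumer would also need to read the accompanying ArrowSchema to learn the type and call the release callback when done.

```python
import ctypes


class ArrowArray(ctypes.Structure):
    """Mirror of `struct ArrowArray` from the Arrow C Data Interface."""


# Fields are assigned after the class exists because the struct refers to itself.
ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowArray))),
    ("private_data", ctypes.c_void_p),
]


def iter_int32(array):
    """Hypothetical helper: yield int32 values, with None for null slots.

    Assumes the caller already knows the array holds int32 data, so that
    buffers[0] is the validity bitmap and buffers[1] is the value buffer.
    """
    validity_addr = array.buffers[0]  # None when the bitmap pointer is NULL
    bitmap = None
    if validity_addr:
        bitmap = ctypes.cast(
            ctypes.c_void_p(validity_addr), ctypes.POINTER(ctypes.c_uint8)
        )
    data = ctypes.cast(
        ctypes.c_void_p(array.buffers[1]), ctypes.POINTER(ctypes.c_int32)
    )
    for i in range(array.offset, array.offset + array.length):
        # Validity bits are least-significant-bit first within each byte.
        is_valid = bitmap is None or ((bitmap[i // 8] >> (i % 8)) & 1) == 1
        yield data[i] if is_valid else None


# Hand-built sample array standing in for one exported by a real producer.
values = (ctypes.c_int32 * 4)(10, 20, 30, 40)
validity = (ctypes.c_uint8 * 1)(0b00001011)  # slots 0, 1, 3 valid; slot 2 null
buffers = (ctypes.c_void_p * 2)(
    ctypes.addressof(validity), ctypes.addressof(values)
)
array = ArrowArray(
    length=4,
    null_count=1,
    offset=0,
    n_buffers=2,
    n_children=0,
    buffers=ctypes.cast(buffers, ctypes.POINTER(ctypes.c_void_p)),
)

result = list(iter_int32(array))
print(result)
```

Even this small case has to deal with the validity bitmap and the array offset, which is the kind of boilerplate each Cython project would otherwise write for itself.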