Details
- Type: Improvement
- Status: Closed
- Priority: Blocker
- Resolution: Fixed
- Versions: 1.17.1, 1.19.0
Description
Hi! I am part of a team building the Flink backend for Ibis (https://github.com/ibis-project/ibis). We would like to leverage PyFlink under the hood for execution; however, PyFlink's requirements are incompatible with several other Ibis requirements. Beyond Ibis, PyFlink's outdated and restrictive requirements prevent it from being used alongside the most recent releases of Python data libraries.
Some of the major libraries we (and likely others in the Python community interested in using PyFlink alongside other libraries) need compatibility with:
- PyArrow (at least >=10.0.0, but there's no reason not to also be compatible with the latest release)
- pandas (should be compatible with 2.x series, but also probably with 1.4.x, released January 2022, and 1.5.x)
- numpy (1.22 was released in December 2021)
- Newer releases of Apache Beam
- Newer releases of cython
Furthermore, uncapped dependencies may be preferable in general, since they avoid the need for a new PyFlink release every time a dependency publishes a new version. A common (and great) argument for not upper-bounding dependencies, especially for libraries: https://iscinumpy.dev/post/bound-version-constraints/
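To make the upper-bound point concrete, here is a minimal sketch in plain Python. The cap shown (`>=1.3.0,<1.4.0`) is a hypothetical example and not necessarily PyFlink's actual constraint, and the helper names `parse` and `satisfies` are made up for illustration. Real resolvers such as pip implement the full PEP 440 rules; this only compares simple dotted release numbers.

```python
def parse(version):
    """Turn a plain release string like '2.0.3' into a comparable tuple (2, 0, 3)."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, lower, upper=None):
    """Return True if lower <= version, and version < upper when a cap is given."""
    v = parse(version)
    if v < parse(lower):
        return False
    return upper is None or v < parse(upper)


# Hypothetical constraints, for illustration only.
capped = {"lower": "1.3.0", "upper": "1.4.0"}  # like pandas>=1.3.0,<1.4.0
uncapped = {"lower": "1.3.0"}                  # like pandas>=1.3.0

# Print, for each release, whether the capped and uncapped constraints accept it.
for release in ["1.3.5", "1.5.3", "2.0.3"]:
    print(release, satisfies(release, **capped), satisfies(release, **uncapped))
```

With the cap in place, every release past the bound is rejected until the library ships a new version that raises it; with only a lower bound, the same releases are accepted without any change on the library side.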
I am currently testing removing upper bounds in https://github.com/apache/flink/pull/23141; so far, builds pass without issue in b65c072, and I'm currently waiting on c8eb15c to see if I can get PyArrow to resolve >=10.0.0. Solving the proposed dependencies results in:
#
# This file is autogenerated by pip-compile with Python 3.8
# by the following command:
#
#    pip-compile --config=pyproject.toml --output-file=dev/compiled-requirements.txt dev/dev-requirements.txt
#
apache-beam==2.49.0
    # via -r dev/dev-requirements.txt
avro-python3==1.10.2
    # via -r dev/dev-requirements.txt
certifi==2023.7.22
    # via requests
charset-normalizer==3.2.0
    # via requests
cloudpickle==2.2.1
    # via
    #   -r dev/dev-requirements.txt
    #   apache-beam
crcmod==1.7
    # via apache-beam
cython==3.0.0
    # via -r dev/dev-requirements.txt
dill==0.3.1.1
    # via apache-beam
dnspython==2.4.1
    # via pymongo
docopt==0.6.2
    # via hdfs
exceptiongroup==1.1.2
    # via pytest
fastavro==1.8.2
    # via
    #   -r dev/dev-requirements.txt
    #   apache-beam
fasteners==0.18
    # via apache-beam
find-libpython==0.3.0
    # via pemja
grpcio==1.56.2
    # via
    #   -r dev/dev-requirements.txt
    #   apache-beam
    #   grpcio-tools
grpcio-tools==1.56.2
    # via -r dev/dev-requirements.txt
hdfs==2.7.0
    # via apache-beam
httplib2==0.22.0
    # via
    #   -r dev/dev-requirements.txt
    #   apache-beam
idna==3.4
    # via requests
iniconfig==2.0.0
    # via pytest
numpy==1.24.4
    # via
    #   -r dev/dev-requirements.txt
    #   apache-beam
    #   pandas
    #   pyarrow
objsize==0.6.1
    # via apache-beam
orjson==3.9.2
    # via apache-beam
packaging==23.1
    # via pytest
pandas==2.0.3
    # via -r dev/dev-requirements.txt
pemja==0.3.0 ; platform_system != "Windows"
    # via -r dev/dev-requirements.txt
pluggy==1.2.0
    # via pytest
proto-plus==1.22.3
    # via apache-beam
protobuf==4.23.4
    # via
    #   -r dev/dev-requirements.txt
    #   apache-beam
    #   grpcio-tools
    #   proto-plus
py4j==0.10.9.7
    # via -r dev/dev-requirements.txt
pyarrow==11.0.0
    # via
    #   -r dev/dev-requirements.txt
    #   apache-beam
pydot==1.4.2
    # via apache-beam
pymongo==4.4.1
    # via apache-beam
pyparsing==3.1.1
    # via
    #   httplib2
    #   pydot
pytest==7.4.0
    # via -r dev/dev-requirements.txt
python-dateutil==2.8.2
    # via
    #   -r dev/dev-requirements.txt
    #   apache-beam
    #   pandas
pytz==2023.3
    # via
    #   -r dev/dev-requirements.txt
    #   apache-beam
    #   pandas
regex==2023.6.3
    # via apache-beam
requests==2.31.0
    # via
    #   apache-beam
    #   hdfs
six==1.16.0
    # via
    #   hdfs
    #   python-dateutil
tomli==2.0.1
    # via pytest
typing-extensions==4.7.1
    # via apache-beam
tzdata==2023.3
    # via pandas
urllib3==2.0.4
    # via requests
wheel==0.41.0
    # via -r dev/dev-requirements.txt
zstandard==0.21.0
    # via apache-beam

# The following packages are considered to be unsafe in a requirements file:
# pip
# setuptools
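
As a rough way to check a compiled file like the one above against the compatibility targets listed earlier, something along these lines could work. This is only a sketch: the `COMPILED` string holds a few pins copied from the output above rather than reading the real file, the minimum versions in `TARGETS` are my reading of the list in the description (with 1.4.0 chosen as the pandas floor and 1.22.0 for numpy), and the helper names are hypothetical.

```python
import re

# A few pins copied from the compiled requirements above.
COMPILED = """\
numpy==1.24.4
    # via -r dev/dev-requirements.txt
pandas==2.0.3
pemja==0.3.0 ; platform_system != "Windows"
pyarrow==11.0.0
"""

# Assumed lower bounds, based on the list in the description.
TARGETS = {"pyarrow": "10.0.0", "pandas": "1.4.0", "numpy": "1.22.0"}

# Matches lines of the form "name==1.2.3", ignoring comments and markers.
PIN = re.compile(r"^([A-Za-z0-9_.-]+)==([0-9][0-9.]*)")


def read_pins(text):
    """Collect {package name: pinned version} from pip-compile style text."""
    pins = {}
    for line in text.splitlines():
        match = PIN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def as_tuple(version):
    """Convert a dotted release string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


pins = read_pins(COMPILED)
for name, minimum in TARGETS.items():
    ok = as_tuple(pins[name]) >= as_tuple(minimum)
    status = "ok" if ok else "too old"
    print(f"{name}: pinned {pins[name]}, wants >= {minimum}: {status}")
```

The tuple comparison only handles plain numeric releases, which is enough for the pins shown here; pre-releases or local version labels would need a proper PEP 440 parser.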