Description
Using `np.bool` generates this warning during `toPandas` with Arrow optimization enabled:
UserWarning: toPandas attempted Arrow optimization because 'spark.sql.execution.arrow.pyspark.enabled' is set to true, but has reached the error below and can not continue. Note that 'spark.sql.execution.arrow.pyspark.fallback.enabled' does not have an effect on failures in the middle of computation.
`np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
See NumPy's deprecation statement: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
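As a rough illustration of the replacement the warning suggests, the sketch below casts an array with the builtin `bool` and with `np.bool_` instead of the deprecated `np.bool`. The array and the variable names are made up for this example and are not taken from the PySpark source.

```python
import numpy as np

# Deprecated spelling, shown only as a comment. It warns on NumPy 1.20
# and later, and the linked SPARK-41718 reports breakage on NumPy 1.24:
#     flags = np.array([1, 0, 1]).astype(np.bool)

# Hypothetical input data for illustration.
values = np.array([1, 0, 1])

# Option 1: the builtin bool, which NumPy maps to its boolean dtype.
flags_builtin = values.astype(bool)

# Option 2: the NumPy scalar type, for code that specifically wants it.
flags_numpy = values.astype(np.bool_)

# Both spellings yield the same dtype and the same values.
same_dtype = flags_builtin.dtype == flags_numpy.dtype
same_values = bool((flags_builtin == flags_numpy).all())
```

Either spelling should be a drop-in replacement in dtype arguments and `astype` calls, which matches the warning's note that the change does not modify behavior.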
Issue Links
- is duplicated by SPARK-41718 Numpy 1.24 breaks PySpark due to use of `np.bool` instead of `np.bool_` in many places (Resolved)
- is related to SPARK-42647 Remove aliases from deprecated numpy data types (Resolved)
- links to