Description
worker.py imports Python's resource module, which means it cannot run on a Windows system. This module is only available in Unix-based environments.
https://github.com/apache/spark/blob/9a5fda60e532dc7203d21d5fbe385cd561906ccb/python/pyspark/worker.py#L25
textFile = sc.textFile("README.md")
textFile.first()
When I run the above commands, I receive the error 'worker failed to connect back', and the console shows an exception coming from worker.py: 'ModuleNotFoundError: No module named resource'.
I do not really know enough about what I'm doing to fix this myself. Apologies if there's something simple I'm missing here.
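One possible direction, offered only as an untested sketch and not as a patch against Spark: guard the import so that a missing resource module is tolerated instead of crashing the worker. The helper name get_address_space_limit below is hypothetical and does not exist in worker.py; it only reads the current limit to show how the guarded module could be used.

```python
# Guarded import: the resource module is part of the standard library
# on Unix-like systems but is not available on Windows.
try:
    import resource
except ImportError:
    resource = None


def get_address_space_limit():
    """Hypothetical helper (not part of worker.py).

    Returns the (soft, hard) address-space limit where the platform
    exposes it, or None where it does not (for example on Windows).
    """
    if resource is None:
        return None
    # RLIMIT_AS may be missing on some platforms, so look it up defensively.
    rlimit_as = getattr(resource, "RLIMIT_AS", None)
    if rlimit_as is None:
        return None
    return resource.getrlimit(rlimit_as)


if __name__ == "__main__":
    limit = get_address_space_limit()
    if limit is None:
        print("resource limits are not available on this platform")
    else:
        print("address space limit (soft, hard):", limit)
```

With a guard like this, code that depends on resource would be skipped on Windows rather than raising ModuleNotFoundError at import time.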
Attachments
Issue Links
- is duplicated by
  - SPARK-26670 worker.py uses resource python library, which is not available for windows (Resolved)
- relates to
  - SPARK-25004 Add spark.executor.pyspark.memory config to set resource.RLIMIT_AS (Resolved)
- links to