SPARK-26080: Unable to run worker.py on Windows

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.1, 3.0.0
    • Component/s: PySpark
    • Labels:
    • Environment: Windows 10 Education 64 bit

      Description

      worker.py imports Python's resource module, which is available only on Unix-based platforms, so worker.py cannot run on Windows.
      https://github.com/apache/spark/blob/9a5fda60e532dc7203d21d5fbe385cd561906ccb/python/pyspark/worker.py#L25

      textFile = sc.textFile("README.md")
      textFile.first()
      

      When I run the commands above, I receive the error 'worker failed to connect back', and the console shows an exception from worker.py: ModuleNotFoundError: No module named 'resource'

      I do not really know enough about what I'm doing to fix this myself. Apologies if there's something simple I'm missing here.
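One common way to handle a Unix-only module is to guard the import and skip the dependent logic when the module is missing. The following is only a minimal sketch of that pattern, not necessarily the change made in worker.py; the flag `has_resource_module` and the helper `current_address_space_limit` are hypothetical names chosen for illustration.

```python
# Guarded import: the standard-library `resource` module exists only on
# Unix-like platforms, so importing it unconditionally fails on Windows.
try:
    import resource
    has_resource_module = True
except ImportError:
    resource = None
    has_resource_module = False


def current_address_space_limit():
    """Hypothetical helper: return the (soft, hard) address-space limit,
    or None when the `resource` module is unavailable (e.g. on Windows)."""
    if not has_resource_module:
        return None
    return resource.getrlimit(resource.RLIMIT_AS)


if __name__ == "__main__":
    limits = current_address_space_limit()
    if limits is None:
        print("resource module not available; skipping memory-limit logic")
    else:
        print("address-space limit (soft, hard):", limits)
```

With a guard like this, code that depends on `resource` is skipped on Windows instead of failing at import time.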

              People

              • Assignee: Hyukjin Kwon (hyukjin.kwon)
              • Reporter: Hayden Jeune (HaydenJ)
              • Votes: 0
              • Watchers: 3

                Dates

                • Created:
                  Updated:
                  Resolved: