Spark / SPARK-26080

Unable to run worker.py on Windows


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.1, 3.0.0
    • Component/s: PySpark
    • Environment: Windows 10 Education 64 bit

    Description

      worker.py imports Python's resource module, so it cannot run on Windows. That module is only available on Unix-based systems.
      https://github.com/apache/spark/blob/9a5fda60e532dc7203d21d5fbe385cd561906ccb/python/pyspark/worker.py#L25

      textFile = sc.textFile("README.md")
      textFile.first()
      

      When I run the commands above, I receive the error 'worker failed to connect back', and the console shows an exception from worker.py saying 'ModuleNotFoundError: No module named resource'.

      I do not really know enough about what I'm doing to fix this myself. Apologies if there's something simple I'm missing here.
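
For illustration only, here is a minimal sketch of one common way to tolerate a Unix-only module: guard the import and skip the dependent logic where the module is missing. This is not necessarily the change that resolved this issue. The names has_resource_module and get_address_space_limit are hypothetical and chosen for this example.

```python
# Sketch: guard a Unix-only import so the script still loads on Windows.
# `has_resource_module` and `get_address_space_limit` are hypothetical names.
try:
    import resource  # standard library, but only on Unix-like systems
    has_resource_module = True
except ImportError:
    resource = None
    has_resource_module = False


def get_address_space_limit():
    """Return the (soft, hard) address-space limits, or None if unsupported."""
    if not has_resource_module:
        return None
    # RLIMIT_AS is not defined on every Unix platform, so look it up defensively.
    rlimit_as = getattr(resource, "RLIMIT_AS", None)
    if rlimit_as is None:
        return None
    return resource.getrlimit(rlimit_as)


if __name__ == "__main__":
    limit = get_address_space_limit()
    if limit is None:
        print("resource limits are not available on this platform")
    else:
        print("address-space limit (soft, hard):", limit)
```

With a guard like this, any code that depends on resource is skipped on Windows instead of failing at import time.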



          People

            Assignee: Hyukjin Kwon (gurwls223)
            Reporter: Hayden Jeune (HaydenJ)
            Votes: 0
            Watchers: 2

