Spark / SPARK-39817

Missing sbin scripts in PySpark packages


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0, 3.2.1, 3.3.0, 3.2.2
    • Fix Version/s: None
    • Component/s: PySpark
    • Flags: Patch, Important

    Description

      The PySpark setup.py packages only a subset of the sbin scripts.
      In particular, the `start-all.sh` script is missing:

              package_data={
                  'pyspark.jars': ['*.jar'],
                  'pyspark.bin': ['*'],
                  'pyspark.sbin': ['spark-config.sh', 'spark-daemon.sh',
                                   'start-history-server.sh',
                                   'stop-history-server.sh', ],
      
                  [...]
              },
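
To see which sbin scripts an installed PySpark actually ships, a check along these lines may help. This is a minimal sketch: the helper names `find_pyspark_dir` and `packaged_sbin_scripts` are hypothetical, and it assumes the `pyspark.sbin` package data ends up in a `sbin` directory next to the installed package's `__init__.py`.

```python
from importlib.util import find_spec
from pathlib import Path
from typing import List, Optional


def packaged_sbin_scripts(package_dir: Path) -> List[str]:
    """Return the sorted file names found in <package_dir>/sbin."""
    sbin_dir = package_dir / "sbin"
    if not sbin_dir.is_dir():
        # No sbin directory at all: nothing was packaged.
        return []
    return sorted(p.name for p in sbin_dir.iterdir() if p.is_file())


def find_pyspark_dir() -> Optional[Path]:
    """Locate the installed pyspark package directory, or None if absent."""
    spec = find_spec("pyspark")
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at pyspark/__init__.py; its parent is the package dir.
    return Path(spec.origin).parent


if __name__ == "__main__":
    pyspark_dir = find_pyspark_dir()
    if pyspark_dir is None:
        print("pyspark is not installed in this environment")
    else:
        for name in packaged_sbin_scripts(pyspark_dir):
            print(name)
```

Comparing the printed names with the `sbin` directory of a full Spark distribution should show which scripts the pip package leaves out.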
      

       

      The fix is simple: change the 'pyspark.sbin' entry to:

      'pyspark.sbin': ['*'],
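
The difference between the explicit list and the `'*'` pattern can be illustrated with plain glob matching, since `package_data` entries are glob patterns relative to the package directory. The sketch below is only an approximation, not setuptools' own code: `select_files` is a hypothetical helper, and the file names in the demo are a small sample of sbin scripts chosen for illustration.

```python
import glob
import os
import tempfile
from typing import List


def select_files(package_dir: str, patterns: List[str]) -> List[str]:
    """Expand glob patterns relative to package_dir and return matching file names."""
    matched = set()
    for pattern in patterns:
        # Escape the directory part so only the pattern itself is treated as a glob.
        full_pattern = os.path.join(glob.escape(package_dir), pattern)
        for path in glob.glob(full_pattern):
            if os.path.isfile(path):
                matched.add(os.path.basename(path))
    return sorted(matched)


if __name__ == "__main__":
    # Explicit list as currently used for 'pyspark.sbin'.
    explicit = ["spark-config.sh", "spark-daemon.sh",
                "start-history-server.sh", "stop-history-server.sh"]
    # Illustrative directory contents: the four listed scripts plus two others.
    sample_scripts = explicit + ["start-all.sh", "stop-all.sh"]

    with tempfile.TemporaryDirectory() as sbin_dir:
        for name in sample_scripts:
            open(os.path.join(sbin_dir, name), "w").close()
        print("explicit list:", select_files(sbin_dir, explicit))
        print("wildcard:     ", select_files(sbin_dir, ["*"]))
```

With the explicit list, any script not named is silently left out of the package; the wildcard picks up every file in the directory, including scripts added later.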
      

       

      I would happily submit a PR on GitHub, but I am not familiar with the contribution process.

      It would be great to have this backported to PySpark 3.2.x and 3.3.x soon.


          People

            Assignee: Unassigned
            Reporter: hoeze F. H.


              Time Tracking

              Original Estimate: 5m
              Remaining Estimate: 5m
              Time Spent: Not Specified
