Spark / SPARK-39817

Missing sbin scripts in PySpark packages

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0, 3.2.1, 3.3.0, 3.2.2
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: Patch, Important

    Description

      In PySpark's setup.py, only a subset of the sbin scripts is listed in package_data, so the rest are not shipped with the package.
      In particular, I'm missing the `submit-all.sh` script:

              package_data={
                  'pyspark.jars': ['*.jar'],
                  'pyspark.bin': ['*'],
                  'pyspark.sbin': ['spark-config.sh', 'spark-daemon.sh',
                                   'start-history-server.sh',
                                   'stop-history-server.sh', ],
                  [...]
              },
      

       

      The fix is simple: change the 'pyspark.sbin' entry to:

      'pyspark.sbin': ['*'],
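
      To illustrate the difference between the two settings, here is a minimal sketch that compares what the explicit list and the proposed `'*'` pattern would select. It is not the setuptools implementation: it approximates glob matching with `fnmatch`, and the directory listing `SBIN_FILES` (including `start-all.sh` and `stop-all.sh`) is an illustrative assumption, not the full contents of sbin.

```python
import fnmatch

# Patterns listed for 'pyspark.sbin' in the setup.py snippet above.
CURRENT_PATTERNS = [
    "spark-config.sh",
    "spark-daemon.sh",
    "start-history-server.sh",
    "stop-history-server.sh",
]
# Proposed replacement: a single wildcard.
PROPOSED_PATTERNS = ["*"]

# Illustrative directory listing (assumption, not the complete sbin directory).
SBIN_FILES = [
    "spark-config.sh",
    "spark-daemon.sh",
    "start-history-server.sh",
    "stop-history-server.sh",
    "start-all.sh",
    "stop-all.sh",
]


def packaged(filenames, patterns):
    """Return the filenames matched by at least one glob-style pattern."""
    return sorted(
        name
        for name in filenames
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)
    )


# Files picked up by the wildcard but not by the explicit list.
missing = sorted(
    set(packaged(SBIN_FILES, PROPOSED_PATTERNS))
    - set(packaged(SBIN_FILES, CURRENT_PATTERNS))
)
print(missing)
```

      With this sample listing, the two extra scripts are the only ones the explicit list leaves out. The trade-off of the wildcard is that any file placed in that directory gets packaged, not only shell scripts.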
      

       

      I would happily submit a PR on GitHub, but I'm not familiar with the project's contribution process.

      It would be great to have this backported to PySpark 3.2.x and 3.3.x soon.

          People

            Assignee: Unassigned
            Reporter: hoeze F. H.
            Votes: 0
            Watchers: 1

              Time Tracking

                Estimated: 5m
                Remaining: 5m
                Logged: Not Specified