SPARK-33017 (project: Spark)

PySpark Context should have getCheckpointDir() method


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: PySpark
    • Labels:
      None

      Description

      To match the Scala API, PySpark should offer a direct way to get the checkpoint dir.

      scala> spark.sparkContext.setCheckpointDir("/tmp/spark/checkpoint")
      scala> spark.sparkContext.getCheckpointDir
      res3: Option[String] = Some(file:/tmp/spark/checkpoint/34ebe699-bc83-4c5d-bfa2-50451296cf87)
      

      Currently, the only way to do that from PySpark is through the underlying Java context:

      >>> spark.sparkContext.setCheckpointDir('/tmp/spark/checkpoint/')
      >>> sc._jsc.sc().getCheckpointDir().get()
      'file:/tmp/spark/checkpoint/ebf0fab5-edbc-42c2-938f-65d5e599cf54'
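
Until a direct method is available, the workaround above can be wrapped in a small helper. The following is only a sketch: the name `get_checkpoint_dir` is hypothetical and not part of PySpark, and it assumes the JVM call returns a Scala `Option[String]` (as the Scala output above suggests), so it checks `isDefined()` before calling `get()` and returns `None` when no checkpoint directory has been set.

```python
from typing import Any, Optional


def get_checkpoint_dir(sc: Any) -> Optional[str]:
    """Return the checkpoint directory of a SparkContext, or None if unset.

    Hypothetical helper, not a PySpark API. `sc` is expected to be a
    pyspark SparkContext; only its private `_jsc` attribute is used.
    """
    # sc._jsc is the JavaSparkContext; .sc() yields the underlying JVM
    # SparkContext, whose getCheckpointDir returns an Option[String].
    jvm_option = sc._jsc.sc().getCheckpointDir()
    # Calling get() on an empty Option raises on the JVM side, so check first.
    if jvm_option.isDefined():
        return jvm_option.get()
    return None
```

In a PySpark shell this would be called as `get_checkpoint_dir(spark.sparkContext)` after `setCheckpointDir(...)`. Because it relies on the private `_jsc` attribute, it may break across Spark versions.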
      

            People

            • Assignee: Paul Reidy (reidy-p)
            • Reporter: Nicholas Chammas (nchammas)
