Details
- Sub-task
- Status: Resolved
- Minor
- Resolution: Fixed
- 3.3.0
- None
Description
FileCommitProtocol is the class that commits Spark job output (staging file and directory renaming, etc.). During Spark 3.2 development, we added new functions to this class to allow more flexible output file naming (the PR detail is here). We didn't delete the existing file naming functions (newTaskTempFile(ext) & newTaskTempFileAbsPath(ext)), because we were aware that many downstream projects and codebases had already written their own custom implementations of FileCommitProtocol. Deleting the existing functions would be a breaking change for them when they upgrade their Spark version, and we would like to spare everyone that unpleasant surprise if possible. But we also need to clean up legacy code as we evolve our codebase.
So as a next step, I would like to propose:
- Spark 3.3 (now): Add the @deprecated annotation to the legacy functions in FileCommitProtocol: newTaskTempFile(ext) & newTaskTempFileAbsPath(ext).
- Next Spark major release (or whenever people feel comfortable): Delete the legacy functions mentioned above from our codebase.
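
To illustrate the first step, here is a minimal Scala sketch of the deprecation pattern. It is not the real FileCommitProtocol: SketchCommitProtocol, NameSpec, DownstreamProtocol, the simplified method signatures, and the deprecation message are all hypothetical names chosen for illustration. The idea it shows is that the legacy function stays in place (so existing custom implementations keep compiling) but carries @deprecated, while the newer function can fall back to it by default.

```scala
// Hypothetical, simplified stand-in for FileCommitProtocol (NOT the real Spark class).
// NameSpec is an assumed helper type describing a file name prefix and suffix.
final case class NameSpec(prefix: String, suffix: String)

abstract class SketchCommitProtocol {
  // Legacy naming function: kept so existing custom implementations still
  // compile, but marked deprecated so callers see a compiler warning.
  @deprecated("use newTaskTempFile(dir, spec: NameSpec)", "3.3.0")
  def newTaskTempFile(dir: Option[String], ext: String): String

  // Newer, more flexible naming function. In this sketch the default delegates
  // to the legacy function when no prefix is requested, so implementations
  // written before the change keep working unmodified.
  def newTaskTempFile(dir: Option[String], spec: NameSpec): String = {
    if (spec.prefix.isEmpty) {
      newTaskTempFile(dir, spec.suffix)
    } else {
      throw new UnsupportedOperationException(
        "this implementation does not support a file name prefix")
    }
  }
}

// A downstream implementation that only knows about the legacy function.
class DownstreamProtocol extends SketchCommitProtocol {
  override def newTaskTempFile(dir: Option[String], ext: String): String = {
    val name = s"part-0000$ext"
    dir.map(d => s"$d/$name").getOrElse(name)
  }
}
```

With this shape, the second step (deleting the legacy function in a later major release) only breaks implementations that ignored the deprecation warning for a full release cycle.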