Details
Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
4.0.0
Description
Summary:
This change proposes logging a warning when .unpersist() is called on a locally checkpointed RDD in Apache Spark. The warning informs users of the risk of unpersisting such an RDD; the behavior of the method itself is unchanged.
Background:
Local checkpointing in Spark truncates the lineage of an RDD, so the RDD can no longer be recomputed from its source. The checkpointed data is held in Spark's caching layer on the executors, so unpersisting a locally checkpointed RDD discards the only copy of its data, which cannot be regenerated. Any subsequent action or transformation that needs that data can then cause the job to fail.
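The failure mode can be illustrated without a cluster. The following is a minimal pure-Python model, not Spark code: ToyRDD and its methods are hypothetical names that only mimic the behavior described above, under the assumption that the cached copy is the sole copy of the data once the lineage is truncated.

```python
class ToyRDD:
    """Toy stand-in for an RDD: a lineage recipe plus an optional cached copy."""

    def __init__(self, compute_from_source):
        # Lineage: a callable that can rebuild the data from its source.
        self._lineage = compute_from_source
        self._cached = None

    def local_checkpoint(self):
        # Materialize once, keep only the cached copy, and drop the lineage.
        self._cached = list(self._lineage())
        self._lineage = None
        return self

    def unpersist(self):
        # Release the cached copy; nothing checks whether lineage still exists.
        self._cached = None
        return self

    def collect(self):
        if self._cached is not None:
            return list(self._cached)
        if self._lineage is not None:
            # No cached copy, but the data can be recomputed from its source.
            return list(self._lineage())
        raise RuntimeError("data was unpersisted and lineage is truncated")


# An RDD that was never checkpointed survives unpersist: it is recomputed.
plain = ToyRDD(lambda: range(3)).unpersist()
plain_result = plain.collect()

# A locally checkpointed RDD does not: its only copy is gone.
checkpointed = ToyRDD(lambda: range(3)).local_checkpoint().unpersist()
try:
    checkpointed.collect()
    failed = False
except RuntimeError:
    failed = True
```

In this model the un-checkpointed RDD is silently recomputed, while the checkpointed one raises on the next action, which corresponds to the job failure described above.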
Proposed Change:
To mitigate this issue, a warning will be logged whenever .unpersist() is called on a locally checkpointed RDD. The method's behavior stays the same, so the change is non-disruptive; it only alerts users to the consequences of the call and makes any resulting failure easier to debug.
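The shape of the proposed check can be sketched as follows. This is a pure-Python illustration, not the Spark implementation: WarnOnUnpersistRDD, ListHandler, the logger name, and the message wording are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("toy.rdd")  # hypothetical logger name


class WarnOnUnpersistRDD:
    """Toy RDD whose unpersist() warns when the RDD was locally checkpointed."""

    def __init__(self, rdd_id, locally_checkpointed=False):
        self.rdd_id = rdd_id
        self.locally_checkpointed = locally_checkpointed
        self.cached = True

    def unpersist(self):
        if self.locally_checkpointed:
            # Warn only; the call is not blocked or altered.
            logger.warning(
                "RDD %d was locally checkpointed; its lineage has been "
                "truncated and it cannot be recomputed after unpersisting",
                self.rdd_id,
            )
        # Existing behavior is unchanged: the cached data is still released.
        self.cached = False
        return self


class ListHandler(logging.Handler):
    """Collects formatted log messages so the example can inspect them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)

plain = WarnOnUnpersistRDD(1).unpersist()
checkpointed = WarnOnUnpersistRDD(2, locally_checkpointed=True).unpersist()
```

Both calls release the cached data as before; only the call on the locally checkpointed instance emits a warning, which is the non-disruptive behavior the proposal describes.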