Description
The first RDD doesn't need to be cached; the other cost RDDs should be persisted with the MEMORY_AND_DISK storage level so they are not recomputed.
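The caching pattern described here might look roughly like the sketch below. It is an illustration under stated assumptions, not the actual MLlib code: the local Spark session, the 1-D sample data, the object name `CostRddPersistSketch`, and the farthest-point center selection (a stand-in for the real k-means|| sampling) are all hypothetical.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical sketch: iteratively refined cost RDDs, each persisted with
// MEMORY_AND_DISK, while the initial (trivial) cost RDD is left uncached.
object CostRddPersistSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("cost-rdd-persist-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Assumed toy data: 1-D points.
    val data: RDD[Double] = sc.parallelize(Seq(1.0, 2.0, 9.0, 10.0, 20.0)).cache()

    // First cost RDD: cheap to rebuild (a constant per point), so it is not cached.
    var costs: RDD[Double] = data.map(_ => Double.PositiveInfinity)
    var centers: Seq[Double] = Seq(1.0)

    for (step <- 0 until 3) {
      val current = centers // immutable copy captured by the closure
      val prev = costs

      // Each later cost RDD depends on the previous one; persisting it
      // keeps the lineage from being recomputed on every action.
      costs = data.zip(prev).map { case (p, c) =>
        math.min(c, current.map(x => (p - x) * (p - x)).min)
      }.persist(StorageLevel.MEMORY_AND_DISK)

      // First action materializes the persisted cost RDD.
      val total = costs.sum()
      println(s"step $step total cost: $total")

      // Simplification: take the farthest point as the next center.
      // The second action reuses the persisted costs instead of recomputing.
      val next = data.zip(costs).reduce((a, b) => if (a._2 >= b._2) a else b)._1
      centers = centers :+ next

      // The previous cost RDD is no longer needed once the new one exists.
      // The step-0 predecessor was never persisted, so it is skipped.
      if (step > 0) prev.unpersist(blocking = false)
    }

    costs.unpersist(blocking = false)
    spark.stop()
  }
}
```

MEMORY_AND_DISK spills partitions that do not fit in memory to disk instead of dropping them, which is why it avoids recomputation where MEMORY_ONLY might not.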
Issue Links
- is related to: SPARK-12450 Un-persist broadcasted variables in KMeans (Resolved)
- relates to: SPARK-10329 Cost RDD in k-means|| initialization is not storage-efficient (Resolved)