Description
To fully support Spark RDD's persistence options, we need to provide a few features.
We need to support:
- Default persistence in memory or on disk.
- Persistence using memory and disk at the same time (spill).
- Persistence in off-heap memory.
- Replication for persisted data.
- Disable changing the persistence strategy after an RDD is executed.
- Report the actual state of cached data to the optimizer.
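As a rough illustration of how these requirements fit together, here is a minimal, self-contained Python sketch. It is not the project's implementation: `PersistLevel`, `CachedDataset`, and `actual_state` are hypothetical names, the level constants are only named after Spark's `StorageLevel` constants, and the flag combinations are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersistLevel:
    """Hypothetical descriptor; fields loosely mirror Spark's StorageLevel flags."""
    use_disk: bool
    use_memory: bool
    use_off_heap: bool = False
    replication: int = 1


# Illustrative levels, named after Spark's StorageLevel constants.
MEMORY_ONLY = PersistLevel(use_disk=False, use_memory=True)
DISK_ONLY = PersistLevel(use_disk=True, use_memory=False)
MEMORY_AND_DISK = PersistLevel(use_disk=True, use_memory=True)  # spill to disk
OFF_HEAP = PersistLevel(use_disk=False, use_memory=True, use_off_heap=True)
MEMORY_ONLY_2 = PersistLevel(use_disk=False, use_memory=True, replication=2)


class CachedDataset:
    """Hypothetical stand-in for an RDD-like dataset (assumption, not a real API)."""

    def __init__(self):
        self.requested_level = None
        self.executed = False

    def persist(self, level):
        # Requirement: the strategy must not change once the data is computed.
        if self.executed:
            raise RuntimeError("cannot change persistence strategy after execution")
        self.requested_level = level
        return self

    def execute(self):
        self.executed = True

    def actual_state(self, memory_available):
        """Where the data really lives; this is what would be reported to the optimizer."""
        level = self.requested_level
        if level is None or not self.executed:
            return "not cached"
        if level.use_memory and memory_available:
            return "off-heap" if level.use_off_heap else "memory"
        if level.use_disk:
            return "disk"  # spilled, or disk was the only target
        return "not cached"  # memory-only data that did not fit


ds = CachedDataset().persist(MEMORY_AND_DISK)
ds.execute()
print(ds.actual_state(memory_available=False))
```

The point of separating `requested_level` from `actual_state` is the last requirement: the optimizer should see where the data ended up, which may differ from what was requested when memory runs short.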
Attachments
1. Implement disk and memory persistence (Spill) | Open | Unassigned
2. Support RDD caching | Resolved | Sanha Lee
3. Implement off-heap memory persistence | Open | Unassigned
4. Implement replication for persisted data | Open | Unassigned
5. Disable changing persistence strategy after a RDD is calculated | Open | Unassigned
6. Report the actual state of cached data to optimizer | Open | Unassigned