Description
Right now, random-distribution generators are implemented as ordinary expressions. As a result, we have to track them in many places in the optimizer and handle them specially. Otherwise, they are treated as stateless expressions and behave unexpectedly (e.g. SPARK-8023).
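To illustrate the kind of problem meant here, the following is a toy model in plain Python. It is not Spark code and not a reproduction of SPARK-8023. It assumes a hypothetical optimizer rewrite that inlines a projected alias `r = rand()` into every place `r` is referenced, which is only safe for stateless expressions. The function names are made up for this sketch.

```python
import random


def run_unoptimized(rows, seed):
    """Project r = rand() once per row, then filter on and output that same r."""
    rng = random.Random(seed)
    out = []
    for row in rows:
        r = rng.random()  # evaluated exactly once per row
        if r > 0.5:
            out.append((row, r))
    return out


def run_naively_inlined(rows, seed):
    """Same query after a (hypothetical) rewrite that replaces each
    reference to r with a fresh call to rand()."""
    rng = random.Random(seed)
    out = []
    for row in rows:
        if rng.random() > 0.5:  # first evaluation, used by the filter
            out.append((row, rng.random()))  # second evaluation, used by the output
    return out


rows = list(range(1000))
safe = run_unoptimized(rows, seed=42)
unsafe = run_naively_inlined(rows, seed=42)

# Rows in the rewritten plan whose displayed value contradicts the filter.
violations = [r for _, r in unsafe if r <= 0.5]
print(len(safe), len(unsafe), len(violations))
```

In the first plan every output value satisfies the filter. In the second, some output values do not, because the filter and the output saw different draws. This is why a rewrite that is valid for a stateless expression needs to be guarded when the expression is a random generator.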
Issue Links
- blocks SPARK-7157 Add approximate stratified sampling to DataFrame (Resolved)