Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Won't Do
- Affects Version/s: 2.4.6, 3.0.0
- Fix Version/s: None
- Component/s: None
Description
When I use Spark RDDs, I often read data from Kafka, and that Kafka data mixes many kinds of data sets. I filter the RDD by Kafka key so that I can build an Array[RDD] with one RDD per topic.
But to do that I have to call rdd.filter once per key, which generates more than one stage: the same data is processed by many tasks and consumes too much time, which is unnecessary.
I hope a multiple-filter function can be added instead of repeated rdd.filter calls, one that returns an Array[RDD] in a single stage by dividing the mixed-data RDD into single-data-set RDDs.
The function would look like: Array[RDD] = rdd.multiplefilter(setcondition).
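To illustrate the requested semantics, here is a minimal sketch in plain Scala using ordinary collections rather than real Spark RDDs. The name `multipleFilter` and the predicate list are hypothetical, taken from this ticket; no such method exists in the Spark API. The point is that the input is scanned once while every predicate's matches are collected, which is the single-stage behavior the ticket asks for:

```scala
// Sketch only: models the proposed multipleFilter over Seq, not RDD.
// multipleFilter and its signature are hypothetical names from this ticket.
object MultipleFilterSketch {
  // Scan the input once; route each element into the bucket of every
  // predicate it satisfies (one output collection per predicate).
  def multipleFilter[T](data: Seq[T], predicates: Seq[T => Boolean]): Array[Seq[T]] = {
    val buckets = Array.fill(predicates.length)(Seq.newBuilder[T])
    for (elem <- data; (p, i) <- predicates.zipWithIndex if p(elem))
      buckets(i) += elem
    buckets.map(_.result())
  }

  def main(args: Array[String]): Unit = {
    // e.g. records tagged by Kafka key, split by topic in one pass
    val records = Seq("topicA:1", "topicB:2", "topicA:3")
    val byTopic = multipleFilter(records, Seq(
      (s: String) => s.startsWith("topicA"),
      (s: String) => s.startsWith("topicB")))
    byTopic.foreach(println)
  }
}
```

With plain `rdd.filter`, each of the N filters would rescan the source data; the sketch shows how a single pass can feed all N outputs at once.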