Description
Let's say I have an RDD with a custom partitioner, and that RDD is registered as a SQL table. I want to run a query such as "select myfield from mytable where myudf(myfield, 'some condition') = somevalue", but I do not want to perform a "full table" scan to get myfield.
However, if the UDF API were extended so that the UDF could be "asked" at runtime whether the current partition is "valid", then only the valid partitions would need to be scanned.
I envision the UDF API being extended with a method such as:
readPartition(partitioner: Partitioner, partitionId: Int): Boolean
where I can cast partitioner to my own custom one and, based on the given partition id and the runtime arguments, the method decides whether to read that partition.
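
To make the idea concrete, here is a rough sketch of how such a hook might be used. Everything in it is an assumption for illustration only: PartitionAwareUDF, RangeBucketPartitioner, EqualsTarget and partitionsToScan are hypothetical names, none of them exist in Spark, and Partitioner is a minimal stand-in that mirrors the two abstract members of org.apache.spark.Partitioner so the sketch runs without Spark on the classpath.

```scala
// Minimal stand-in for org.apache.spark.Partitioner (same two abstract members),
// so this sketch runs without Spark on the classpath.
abstract class Partitioner extends Serializable {
  def numPartitions: Int
  def getPartition(key: Any): Int
}

// Hypothetical custom partitioner: routes integer keys into fixed-width ranges.
class RangeBucketPartitioner(val numPartitions: Int, val bucketWidth: Int)
    extends Partitioner {
  def getPartition(key: Any): Int = key match {
    case k: Int => math.min(math.max(k, 0) / bucketWidth, numPartitions - 1)
    case _      => 0
  }
}

// Hypothetical extension point proposed in this issue; NOT part of Spark.
trait PartitionAwareUDF {
  def readPartition(partitioner: Partitioner, partitionId: Int): Boolean
}

// Hypothetical UDF: only the partition that can hold `target` needs to be read.
class EqualsTarget(target: Int) extends PartitionAwareUDF {
  def readPartition(partitioner: Partitioner, partitionId: Int): Boolean =
    partitioner match {
      // Cast to the custom partitioner and use its key-to-partition mapping.
      case p: RangeBucketPartitioner => p.getPartition(target) == partitionId
      // Unknown partitioner: be conservative and read every partition.
      case _ => true
    }
}

object PartitionPruningSketch {
  // Partition ids that would still have to be scanned for the given UDF.
  def partitionsToScan(udf: PartitionAwareUDF, p: Partitioner): Seq[Int] =
    (0 until p.numPartitions).filter(id => udf.readPartition(p, id))
}
```

With a RangeBucketPartitioner of 4 partitions and a bucket width of 100, a UDF looking for the value 250 would only accept partition 2, so the other three partitions could be skipped. The fallback to true for unrecognized partitioners keeps the result correct at the cost of a full scan.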