Description
The source code and API docs for the PySpark RDD map() function say:
    def map(self, f, preservesPartitioning=False):
        """
        Return a new RDD containing the distinct elements in this RDD.
        """
        def func(split, iterator): return imap(f, iterator)
        return PipelinedRDD(self, func, preservesPartitioning)
I think this docstring was incorrectly copied and pasted from the distinct() function; it should instead say "Return a new RDD by applying a function to each element of this RDD."
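
To make the difference between the two descriptions concrete, here is a minimal sketch that does not use Spark at all. TinyRDD is a hypothetical stand-in backed by a plain list, introduced only for illustration; it ignores partitioning and is not how PySpark implements either method. Each method carries the docstring that matches its behaviour.

```python
class TinyRDD:
    """Hypothetical list-backed stand-in for an RDD (assumption, not PySpark)."""

    def __init__(self, items):
        self._items = list(items)

    def map(self, f):
        """Return a new RDD by applying a function to each element of this RDD."""
        # Every input element yields exactly one output element; duplicates stay.
        return TinyRDD(f(x) for x in self._items)

    def distinct(self):
        """Return a new RDD containing the distinct elements in this RDD."""
        # Keeping first-seen order is a choice made for this sketch only;
        # a real distributed distinct() makes no ordering promise.
        seen = set()
        unique = []
        for x in self._items:
            if x not in seen:
                seen.add(x)
                unique.append(x)
        return TinyRDD(unique)

    def collect(self):
        """Return the elements as a plain list."""
        return list(self._items)


rdd = TinyRDD([1, 2, 2, 3])
doubled = rdd.map(lambda x: x * 2).collect()  # duplicates preserved
unique = rdd.distinct().collect()             # duplicates removed
```

With the input [1, 2, 2, 3], map keeps all four elements while distinct drops the repeated 2, which is why the current map() docstring is misleading.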