Description
One of the warnings we get from R CMD check is that the RDD implementations of some generics are not documented. These generics are shared between the RDD and DataFrame APIs in SparkR. The full warning is:
WARNING
Undocumented S4 methods:
generic 'cache' and siglist 'RDD'
generic 'collect' and siglist 'RDD'
generic 'count' and siglist 'RDD'
generic 'distinct' and siglist 'RDD'
generic 'first' and siglist 'RDD'
generic 'join' and siglist 'RDD,RDD'
generic 'length' and siglist 'RDD'
generic 'partitionBy' and siglist 'RDD'
generic 'persist' and siglist 'RDD,character'
generic 'repartition' and siglist 'RDD'
generic 'show' and siglist 'RDD'
generic 'take' and siglist 'RDD,numeric'
generic 'unpersist' and siglist 'RDD'
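As described in https://stat.ethz.ch/pipermail/r-devel/2003-September/027490.html, this looks like a limitation of R: exporting a generic from a package also exports all the implementations of that generic. The RDD methods therefore become visible to R CMD check even though the RDD API itself is not meant to be public, and the check expects documentation for them.
Two ways to get around this are to remove the RDD API or to rename the RDD methods in Spark 2.1.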
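
A minimal sketch of the rename option, using toy S4 classes rather than the real SparkR definitions. The class slots and the name cacheRDD are assumptions made for illustration, not the names the project has settled on. The idea is that once the RDD method lives under its own generic, the shared generic no longer carries an RDD signature, so exporting it should not pull an undocumented RDD method along.

```r
library(methods)

# Toy stand-ins for the SparkR classes (hypothetical slots, not the real ones).
setClass("RDD", slots = c(id = "character"))
setClass("DataFrame", slots = c(id = "character"))

# Shared generic that the package exports and documents for DataFrame.
setGeneric("cache", function(x) standardGeneric("cache"))
setMethod("cache", signature(x = "DataFrame"),
          function(x) paste("cached DataFrame", x@id))

# RDD-only generic under a different (hypothetical) name. Because it is a
# separate generic, it can be left out of the NAMESPACE exports.
setGeneric("cacheRDD", function(x) standardGeneric("cacheRDD"))
setMethod("cacheRDD", signature(x = "RDD"),
          function(x) paste("cached RDD", x@id))

# The shared generic no longer has a method for the RDD signature.
hasSharedRDDMethod <- existsMethod("cache", "RDD")
hasRenamedRDDMethod <- existsMethod("cacheRDD", "RDD")
```

With this layout, internal callers would use cacheRDD(rdd) while users keep calling cache(df) on DataFrames.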