Details
- New Feature
- Status: Resolved
- Major
- Resolution: Incomplete
- 1.5.2
- None
Description
In R, unique() drops rows that are duplicated across all columns, and something like
df[!duplicated(df[,c('x1', 'x2')]),]
is used to drop rows that are duplicated on selected columns. It would be better if SparkR supported duplicated(), as well as subsetting a DataFrame using the result of duplicated().
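For reference, a minimal base R sketch of the behavior described above. It runs on a local data.frame, not a SparkR DataFrame, and the column x3 and all values are made up for illustration.

```r
# Base R only; a local data.frame, not a SparkR DataFrame.
# The column x3 and the values below are hypothetical.
df <- data.frame(
  x1 = c(1, 1, 2, 2),
  x2 = c("a", "a", "b", "c"),
  x3 = c(10, 20, 30, 40)
)

# unique() compares whole rows. Every row differs in x3,
# so no row is dropped here.
all_cols <- unique(df)

# duplicated() returns a logical vector that flags rows whose
# (x1, x2) pair already appeared in an earlier row.
dup <- duplicated(df[, c("x1", "x2")])

# Subset with the negated flags to keep the first row
# for each (x1, x2) pair.
selected_cols <- df[!dup, ]
print(selected_cols)
```

The request amounts to supporting both steps on a distributed DataFrame: computing the logical flags, and filtering rows with them.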
Issue Links
- relates to SPARK-12337 Implement dropDuplicates() method of DataFrame in SparkR (Resolved)