Affects Version/s: 1.5.0
Fix Version/s: None
I was working with DataFrames and found that the intersect() method always seems to return '1'. The RDD intersection() method works properly.
Consider this example:
scala> val firstFile = sqlContext.read.parquet("/Users/ramkandasamy/sparkData/2015-07-25/*").select("id").distinct
firstFile: org.apache.spark.sql.DataFrame = [id: string]
res4: Long = 1072046
res5: Long = 1
res6: Long = 1072046
I have tried various cases, and for some reason the DataFrame's intersect() method always returns 1.
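For anyone who wants to compare the two code paths without access to the Parquet files above, here is a minimal sketch. It uses small in-memory data instead of the original dataset, and the object name `IntersectCheck`, the method `run`, and the sample ids are all made up for illustration. I have not confirmed that data this small triggers the problem; it only shows the DataFrame intersect() count next to the RDD intersection() count for the same rows.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical helper, not part of Spark: builds two single-column
// DataFrames and counts their intersection in two different ways.
object IntersectCheck {

  // Returns (count via DataFrame.intersect, count via RDD.intersection).
  def run(): (Long, Long) = {
    val conf = new SparkConf().setAppName("intersect-check").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val sqlContext = new SQLContext(sc)
      import sqlContext.implicits._

      // Sample ids are assumptions; the two sets share exactly "c" and "d".
      val first  = Seq(Tuple1("a"), Tuple1("b"), Tuple1("c"), Tuple1("d")).toDF("id").distinct
      val second = Seq(Tuple1("c"), Tuple1("d"), Tuple1("e")).toDF("id").distinct

      // DataFrame path: the method this report is about.
      val dfCount = first.intersect(second).count()

      // RDD path: the same rows, intersected as RDD[Row].
      val rddCount = first.rdd.intersection(second.rdd).count()

      (dfCount, rddCount)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (dfCount, rddCount) = run()
    println(s"DataFrame intersect count: $dfCount")
    println(s"RDD intersection count:    $rddCount")
  }
}
```

The two sets share two ids, so a correct intersect() should give a count of 2 on both paths. If the DataFrame count differs from the RDD count, that matches the behavior described above.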