I am using SparkR from RStudio, and I ran into an error with the join function that I recreated with a smaller example:
Running this code, I encountered the error:
However, if I changed the joinType to "leftsemi",
I would get the error:
Since the join function in R appears to invoke a Java method, I went into DataFrame.R and changed "semijoin" to "leftsemi" on lines 1374 and 1378 so that the value matches what the Java method accepts. This also makes the joinType values accepted by R match those accepted by Scala.
This fixed the issue. I'm not sure whether the change breaks Hive compatibility or causes other problems, but I can submit a pull request for it.
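For anyone who wants to see the shape of the change without opening DataFrame.R, here is a rough standalone sketch of the kind of check I edited. It is not the actual SparkR source: `check_join_type` is a hypothetical name, and the other entries in the allowed list are my assumption about what the R side accepts. The only point is that the R whitelist has to contain "leftsemi" rather than "semijoin" for the string to survive both the R check and the JVM side.

```r
# Simplified stand-in for the joinType validation in DataFrame.R (not the real source).
# `check_join_type` is a hypothetical helper; the allowed values other than
# "leftsemi" are assumed for illustration.
check_join_type <- function(joinType,
                            allowed = c("inner", "outer", "left_outer",
                                        "right_outer", "leftsemi")) {
  if (!(joinType %in% allowed)) {
    stop("joinType must be one of the following types: ",
         paste0("'", allowed, "'", collapse = ", "))
  }
  # In SparkR, this string would then be passed on to the JVM-side join method.
  joinType
}

# Accepted once the allowed list uses "leftsemi".
check_join_type("leftsemi")

# The old value is now rejected on the R side; capture the message instead of aborting.
tryCatch(check_join_type("semijoin"), error = function(e) conditionMessage(e))
```

With the original list, the situation was reversed: "semijoin" passed the R check but was not a value the Java method understood, and "leftsemi" never got past the R check.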