Spark 1.x's Dataset API runs into subtle source incompatibility problems for Java 8 and Scala 2.12 users when Spark is built against Scala 2.12. In a nutshell, the current API has overloaded methods whose signatures become ambiguous when resolving calls that use Java 8 lambda syntax. The ambiguity arises only when Spark is built against Scala 2.12.
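To make the ambiguity concrete, here is a minimal, Spark-free Java sketch of the general mechanism. It is an illustration under assumptions, not the actual Dataset signatures: ScalaStyleFunction and JavaStyleMapFunction are hypothetical stand-ins for a Scala function type (as Java sees it once it is usable as a lambda target) and for a Java-friendly interface such as MapFunction. The exact conditions that affect Spark are covered in the writeup.

```java
public class OverloadAmbiguityDemo {

    /** Hypothetical stand-in for a Scala function type that Java can target with a lambda. */
    @FunctionalInterface
    interface ScalaStyleFunction<T, R> {
        R apply(T value);
    }

    /** Hypothetical stand-in for a Java-friendly functional interface. */
    @FunctionalInterface
    interface JavaStyleMapFunction<T, R> {
        R call(T value);
    }

    // Two overloads that differ only in which functional interface they accept.
    static String map(ScalaStyleFunction<Integer, Integer> f) {
        return "scala-style overload: " + f.apply(1);
    }

    static String map(JavaStyleMapFunction<Integer, Integer> f) {
        return "java-style overload: " + f.call(1);
    }

    public static void main(String[] args) {
        // An implicitly typed lambda fits both overloads, so javac rejects
        // the following call with "reference to map is ambiguous":
        //
        //     map(x -> x + 1);

        // An explicit cast picks one overload and compiles.
        System.out.println(map((JavaStyleMapFunction<Integer, Integer>) x -> x + 1));
        System.out.println(map((ScalaStyleFunction<Integer, Integer>) x -> x + 1));
    }
}
```

Callers can work around this kind of ambiguity with a cast, as above, but that is a source-level change for existing user code, which is why removing the conflicting overloads is the more attractive fix.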
This issue is somewhat subtle, so there's a full writeup at https://docs.google.com/document/d/1P_wmH3U356f079AYgSsN53HKixuNdxSEvo8nw_tgLgM/edit?usp=sharing that describes the exact circumstances under which the current APIs are problematic. The writeup also proposes a solution: remove certain overloads only in Scala 2.12 builds of Spark, and introduce implicit conversions to retain source compatibility.
We don't need to implement any of these changes until we add Scala 2.12 support, since the changes must be applied only when building against Scala 2.12. They will be done via traits + shims which are mixed in via per-Scala-version source directories (like how we handle the Scala-version-specific parts of the REPL). For now, this JIRA acts as a placeholder so that the parent JIRA reflects the complete set of tasks which need to be finished for 2.12 support.