Q1. What are you trying to do?
The main idea is to use Scala macros to give developers a more convenient and type-safe API for specifying join conditions.
Q2. What problem is this proposal NOT designed to solve?
The R, Java, Python, and DataFrame APIs are out of scope. The proposal does not affect plan generation either.
Q3. How is it done today, and what are the limits of current practice?
Currently the join condition is specified via column names passed as strings, which can lead to silly mistakes (typos, incompatible column types, etc.) and is sometimes hard to read (for example, when several joins are chained and the resulting type is a tuple of tuples of tuples...).
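To make the limitation concrete, here is a minimal sketch of a string-based join condition using the existing Dataset.joinWith API. The case classes User and Order and the helper object are hypothetical and exist only for illustration.

```scala
import org.apache.spark.sql.Dataset

// Hypothetical domain types, used only for this illustration.
case class User(id: Long, name: String)
case class Order(orderId: Long, userId: Long, amount: Double)

object StringlyTypedJoin {
  // The join condition refers to columns by string name.
  // A typo such as users("idd") still compiles; it is only rejected
  // at runtime, when Spark tries to resolve the column name.
  def joinUsersWithOrders(
      users: Dataset[User],
      orders: Dataset[Order]): Dataset[(User, Order)] =
    users.joinWith(orders, users("id") === orders("userId"), "inner")
}
```

Nothing in the signature ties the strings "id" and "userId" to the fields of User and Order, so renaming a field silently breaks the join until the job runs.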
Q4. What is new in your approach and why do you think it will be successful?
Scala macros can be used to extract the column name directly from a lambda (an extractor). As a side effect, it is possible to check the column types at compile time and reject an inconsistent join expression (such as a boolean-timestamp comparison).
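The name extraction could look roughly like the sketch below. It is not the proposal's actual implementation: the object ColumnName and the method of are hypothetical names, and the code assumes Scala 2 def macros (scala-reflect on the classpath), which must be compiled in a separate compilation unit from the code that calls them.

```scala
import scala.language.experimental.macros
import scala.reflect.macros.blackbox

// Hypothetical helper: turns a field accessor lambda into its field name
// at compile time.
object ColumnName {
  def of[T, V](extractor: T => V): String = macro impl[T, V]

  def impl[T, V](c: blackbox.Context)(extractor: c.Expr[T => V]): c.Expr[String] = {
    import c.universe._
    extractor.tree match {
      // Matches lambdas of the shape (x => x.field), which is what _.field expands to.
      case Function(List(param), Select(Ident(name), field)) if name == param.name =>
        c.Expr[String](Literal(Constant(field.decodedName.toString)))
      // Anything else (nested access, method calls, ...) is rejected at compile time.
      case other =>
        c.abort(c.enclosingPosition, s"Expected a simple field accessor like _.field, got: $other")
    }
  }
}
```

For a hypothetical case class User(id: Long, name: String), a call such as ColumnName.of[User, Long](_.id) is replaced by the string literal "id" during compilation, while a misspelled field fails type checking before the macro even runs. Because V is the static type of the field, a similar macro could also compare the types on both sides of a condition and abort on a mismatch.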
Q5. Who cares? If you are successful, what difference will it make?
Mainly Scala developers who prefer type-safe code: they would get a cleaner API that makes the codebase a bit clearer, especially when several chained joins are used.
Q6. What are the risks?
Overuse of macros may slow down compilation. In addition, macros are hard to maintain.
Q7. How long will it take?
The approach is already implemented as a separate library that does a bit more than provide an alternative API (for example, it abstracts Dataset[T] to F[T], which allows some Spark-specific code to run without a Spark session for testing purposes).
Adapting it should not be hard: a matter of several weeks.
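The Dataset[T] to F[T] abstraction mentioned above could be sketched as a type class. This is only an assumption about the general shape of such a design, not the library's real API: DataOps, Pipeline, and every method name here are hypothetical, and only an in-memory List instance is shown.

```scala
import scala.language.higherKinds

// Hypothetical type class: the operations the business logic needs
// from any "dataset-like" container F[_].
trait DataOps[F[_]] {
  def filter[A](fa: F[A])(p: A => Boolean): F[A]
  def innerJoin[A, B, K](left: F[A], right: F[B])(leftKey: A => K, rightKey: B => K): F[(A, B)]
}

object DataOps {
  // In-memory instance for unit tests: no SparkSession required.
  implicit val listOps: DataOps[List] = new DataOps[List] {
    def filter[A](fa: List[A])(p: A => Boolean): List[A] = fa.filter(p)

    def innerJoin[A, B, K](left: List[A], right: List[B])(leftKey: A => K, rightKey: B => K): List[(A, B)] =
      for {
        a <- left
        b <- right
        if leftKey(a) == rightKey(b)
      } yield (a, b)
  }
}

object Pipeline {
  // Logic written once against F[_]; rows are (id, value) pairs for brevity.
  def bigOrdersByUser[F[_]](users: F[(Long, String)], orders: F[(Long, Double)])(
      implicit ops: DataOps[F]): F[((Long, String), (Long, Double))] =
    ops.innerJoin(users, ops.filter(orders)(_._2 > 100.0))(_._1, _._1)
}
```

In a test, Pipeline.bigOrdersByUser can be called with plain List values. A Dataset-backed instance of DataOps would be written separately and is not shown here, since it would also need Spark encoders.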
Q8. What are the mid-term and final “exams” to check for success?
API convenience is very hard to measure, as it is more or less a matter of taste.
You can find examples of such a 'cleaner' API here
Note that backward and forward compatibility is achieved by introducing a brand-new API without modifying the old one.