Details
-
Improvement
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
None
Description
As part of Spark 2.0, we want to create a stable API foundation for Dataset to become the main user-facing API in Spark. This ticket tracks various tasks related to that.
The main high level changes are:
1. Merge Dataset/DataFrame
2. Create a more natural entry point for Dataset (SQLContext/HiveContext are not ideal because of the name "SQL"/"Hive", and "SparkContext" is not ideal because of its heavy dependency on RDDs)
3. First class support for sessions
4. First class support for some system catalog
See the design doc for more details.