[SPARK-15856] Revert API breaking changes made in SQLContext.range - ASF JIRA

In Spark 2.0, after unifying Datasets and DataFrames, we made two API breaking changes:

DataFrameReader.text() now returns Dataset[String] instead of DataFrame
SQLContext.range() now returns Dataset[java.lang.Long] instead of DataFrame

However, these two changes introduced several inconsistencies and problems:

spark.read.text() silently discards partition columns when reading a partitioned table in text format, since Dataset[String] contains only a single field. Users have to use spark.read.format("text").load() to work around this, which is confusing and error-prone.
All data source shortcut methods in DataFrameReader return DataFrame (aka Dataset[Row]) except for DataFrameReader.text().
When applying typed operations over Datasets returned by spark.range(), unexpected schema changes may happen. Please refer to SPARK-15632 for more details.
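
The workaround mentioned in the first point can be sketched roughly as follows. This is a minimal illustration, not code from the pull request: the temporary directory, the sample rows, and the partition column name day are all hypothetical, and it assumes a local Spark 2.x session. Reading through format("text").load() yields a DataFrame, so the columns discovered from the partition directories can sit alongside the value column.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object TextPartitionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("text-partition-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: one string column plus a partition column.
    val dir = Files.createTempDirectory("spark-15856").resolve("logs").toString
    Seq(("a", "2016-06-01"), ("b", "2016-06-02"))
      .toDF("value", "day")
      .write.partitionBy("day").text(dir)

    // Generic reader path: the result is a DataFrame (Dataset[Row]), which
    // has room for the partition column discovered from the directory names.
    val df: DataFrame = spark.read.format("text").load(dir)
    df.printSchema()

    spark.stop()
  }
}
```

A Dataset[String] has no place to carry the day column, which is why a text() shortcut returning that type would drop it.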

For these reasons, we decided to revert these two changes.

links to

[Github] Pull Request #13604 (cloud-fan)