Description
Strongly typing the return value of read.text as Dataset[String] breaks when trying to load a partitioned table (or any table whose path looks partitioned): the discovered partition column is appended to the schema, so the result no longer matches the single-column struct<value:string> expected by the String encoder.
Seq((1, "test")) .toDF("a", "b") .write .format("text") .partitionBy("a") .save("/home/michael/text-part-bug") sqlContext.read.text("/home/michael/text-part-bug")
org.apache.spark.sql.AnalysisException: Try to map struct<value:string,a:int> to Tuple1, but failed as the number of fields does not line up.
 - Input schema: struct<value:string,a:int>
 - Target schema: struct<value:string>;
  at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.org$apache$spark$sql$catalyst$encoders$ExpressionEncoder$$fail$1(ExpressionEncoder.scala:265)
  at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.validate(ExpressionEncoder.scala:279)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:197)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:168)
  at org.apache.spark.sql.Dataset$.apply(Dataset.scala:57)
  at org.apache.spark.sql.Dataset.as(Dataset.scala:357)
  at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:450)
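A possible workaround until this is fixed (a sketch only, not verified against this build) is to load the files through the generic text source as an untyped DataFrame, keep only the value column so the partition column "a" is dropped, and only then apply the String encoder:

// Sketch of a workaround: read the partitioned text table as a DataFrame,
// project away the inferred partition column, then convert to Dataset[String].
// Uses the same path as the reproduction above.
import org.apache.spark.sql.Dataset
import sqlContext.implicits._

val ds: Dataset[String] = sqlContext.read
  .format("text")
  .load("/home/michael/text-part-bug")
  .select("value")   // schema is now struct<value:string>, matching the encoder
  .as[String]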