  Spark / SPARK-12562

DataFrame.write.format("text") requires the column name to be called value

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 1.6.1, 2.0.0
    • Component/s: SQL
    • Labels:
      None

      Description

      We should support writing any DataFrame that has a single string column, regardless of the column's name. Currently the text data source fails unless that column is named value:

      wiki.select("text")
        .limit(10000)
        .write
        .format("text")
        .mode("overwrite")
        .save("/home/michael/wiki.txt")
      
      org.apache.spark.sql.AnalysisException: cannot resolve 'value' given input columns text;
      	at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:60)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:57)
      	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:319)
      	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:319)
      	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:53)
      	at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:318)
      	at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionUp$1(QueryPlan.scala:107)
      	at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$2(QueryPlan.scala:117)
      	at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$2$1.apply(QueryPlan.scala:121)
      	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
      	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
      	at scala.collection.immutable.List.foreach(List.scala:318)
      	at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
      	at scala.collection.AbstractTraversable.map(Traversable.scala:105)
      	at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$2(QueryPlan.scala:121)
      	at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$2.apply(QueryPlan.scala:125)
      	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
      	at scala.collection.Iterator$class.foreach(Iterator.scala:727)
      	at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
      	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
      	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
      	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
      	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
      	at scala.collection.AbstractIterator.to(Iterator.scala:1157)
      	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
      	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
      	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
      	at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
      	at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:125)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:57)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:50)
      	at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:105)
      	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:50)
      	at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44)
      	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
      	at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
      	at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
      	at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation.run(InsertIntoHadoopFsRelation.scala:106)
      	at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
      	at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
      	at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
      	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
      	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
      	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
      	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
      	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
      	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
      	at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:256)
      	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
      	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
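
On an affected version (1.6.0), one possible workaround is to rename the single column to value before writing, since the error above shows the writer resolving a column by that name. The following is a minimal sketch, not the fix that was merged: the object name TextWriteWorkaround and the helper saveAsText are hypothetical, and it assumes the DataFrame has exactly one string column.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.StringType

// Hypothetical helper, not part of Spark.
object TextWriteWorkaround {

  /** Writes a single-string-column DataFrame as text, whatever the column is called. */
  def saveAsText(df: DataFrame, path: String): Unit = {
    // The text data source only accepts one column of string type.
    require(df.schema.fields.length == 1, "expected exactly one column")
    require(df.schema.fields.head.dataType == StringType, "expected a string column")

    df.toDF("value")        // rename the only column to the name the writer looks up
      .write
      .format("text")
      .mode("overwrite")    // same save mode as in the report
      .save(path)
  }
}
```

With this helper, the failing call from the description could be written as TextWriteWorkaround.saveAsText(wiki.select("text").limit(10000), "/home/michael/wiki.txt"). On the fix versions listed above (1.6.1, 2.0.0) the rename should no longer be needed.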
      

            People

            • Assignee:
              xguo27 Xiu (Joe) Guo
            • Reporter:
              marmbrus Michael Armbrust
            • Votes:
              0
            • Watchers:
              4
