Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
2.3.0, 2.3.1, 2.3.2, 2.4.0, 3.0.0
None
Description
When I use withColumn to add a column to a Structured Streaming Dataset, I get the following exception:
org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to dataType on unresolved object, tree: 'timestamp
The following sample code reproduces the issue:
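import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.streaming.StreamingQuery;
import org.apache.spark.sql.streaming.Trigger;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

final String path = "path_to_input_directory";

// Schema of the CSV input: a word and its count, both non-nullable.
final StructType schema = new StructType(new StructField[] {
    new StructField("word", DataTypes.StringType, false, Metadata.empty()),
    new StructField("count", DataTypes.IntegerType, false, Metadata.empty())
});

SparkSession sparkSession = SparkSession.builder()
    .appName("StructuredStreamingIssue")
    .master("local")
    .getOrCreate();

// Read the input directory as a stream of CSV rows.
Dataset<Row> words = sparkSession.readStream()
    .option("sep", ",")
    .schema(schema)
    .csv(path);

// Adding the current timestamp as a new column triggers the exception.
Dataset<Row> wordsWithTimestamp = words.withColumn("timestamp", functions.current_timestamp());
// wordsWithTimestamp.explain(true);

StreamingQuery query = wordsWithTimestamp.writeStream()
    .outputMode("update")
    .option("truncate", "false")
    .format("console")
    .trigger(Trigger.ProcessingTime("2 seconds"))
    .start();
query.awaitTermination();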
The input file at that path contains the following:
a,2
c,4
d,2
r,1
t,9
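To make the reproduction easier to set up, the input directory can be prepared programmatically. The following is only a minimal sketch using java.nio: the class name SampleInput, the method writeSampleInput, and the file name words.csv are all hypothetical and not part of the original report, and it assumes the file holds one "word,count" record per line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for preparing the input directory; not from the original report.
class SampleInput {
    // The five records listed in the report, assumed to be one "word,count" pair per line.
    static final List<String> RECORDS = Arrays.asList("a,2", "c,4", "d,2", "r,1", "t,9");

    // Writes the records to <dir>/words.csv (the file name is an assumption)
    // and returns the path of the written file.
    static Path writeSampleInput(Path dir) throws IOException {
        Files.createDirectories(dir);
        Path file = dir.resolve("words.csv");
        return Files.write(file, RECORDS, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Use the directory given on the command line, or a fresh temporary directory.
        Path dir = args.length > 0
            ? Paths.get(args[0])
            : Files.createTempDirectory("streaming-input");
        System.out.println("Wrote " + writeSampleInput(dir));
    }
}
```

The directory passed to writeSampleInput would then be the value used for path in the sample code.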
This seems to work with the 2.2.0 release, but not with 2.3.0 or 2.4.0.