Spark / SPARK-20432

Unioning two identical Streaming DataFrames fails during attribute resolution

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: Structured Streaming
    • Labels: None

    Description

      To reproduce, union a streaming DataFrame read from Kafka with itself:

      df = spark.readStream.format("kafka")... \
        .select(from_json(col("value").cast("string"), simpleSchema).alias("parsed_value"))
      
      df.union(df).writeStream...
      

      The resulting exception is confusing:

      org.apache.spark.sql.AnalysisException: resolved attribute(s) value#526 missing from value#511,topic#512,partition#513,offset#514L,timestampType#516,key#510,timestamp#515 in operator !Project [jsontostructs(...) AS parsed_value#357];
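
One way to read this message, sketched in plain Python with no Spark required: Spark prints each attribute as `name#exprId`, and here the name `value` is present in the operator's input, but under ID 511 rather than the ID 526 that the projection refers to. The helper `explain_missing` and its regular expressions are illustrative assumptions for this report, not part of any Spark API, and they only handle messages shaped like the one above.

```python
import re

# Error text copied from the report (the elided "..." is kept as-is).
MESSAGE = (
    "resolved attribute(s) value#526 missing from "
    "value#511,topic#512,partition#513,offset#514L,timestampType#516,"
    "key#510,timestamp#515 in operator "
    "!Project [jsontostructs(...) AS parsed_value#357];"
)

# An attribute is rendered as name#id, optionally followed by a type suffix such as L.
ATTR = re.compile(r"(\w+)#(\d+)L?")


def explain_missing(message):
    """Hypothetical helper: list (name, missing_id, available_id) for every
    missing attribute whose name exists in the input under a different ID."""
    match = re.search(
        r"resolved attribute\(s\) (.+?) missing from (.+?) in operator", message
    )
    if match is None:
        return []
    missing = ATTR.findall(match.group(1))
    available = dict(ATTR.findall(match.group(2)))
    return [
        (name, int(missing_id), int(available[name]))
        for name, missing_id in missing
        if name in available and available[name] != missing_id
    ]


if __name__ == "__main__":
    for name, missing_id, available_id in explain_missing(MESSAGE):
        print(f"{name}: plan refers to #{missing_id}, input provides #{available_id}")
```

Seen this way, the message is less about a column that does not exist and more about two copies of the same column carrying different IDs, which is consistent with the failure appearing only when the DataFrame is unioned with itself.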
      

            People

              Assignee: Unassigned
              Reporter: Burak Yavuz (brkyvz)
              Votes: 0
              Watchers: 4
