Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-23616

Streaming self-join using SQL throws resolution exceptions

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: Structured Streaming
    • Labels:
      None

      Description

      Reported on the dev list.

      import org.apache.spark.sql.streaming.Trigger 
      val jdf = spark.readStream.format("kafka").option("kafka.bootstrap.servers", "localhost:9092").option("subscribe", "join_test").option("startingOffsets", "earliest").load();
      
      jdf.createOrReplaceTempView("table")
      
      val resultdf = spark.sql("select * from table as x inner join table as y on x.offset=y.offset")
      
      resultdf.writeStream.outputMode("update").format("console").option("truncate", false).trigger(Trigger.ProcessingTime(1000)).start()
      

      This is giving the following error

      org.apache.spark.sql.AnalysisException: cannot resolve '`x.offset`' given input columns: [x.value, x.offset, x.key, x.timestampType, x.topic, x.timestamp, x.partition]; line 1 pos 50;
      'Project [*]
      +- 'Join Inner, ('x.offset = 'y.offset)
       :- SubqueryAlias x
       : +- SubqueryAlias table
       : +- StreamingRelation DataSource(org.apache.spark.sql.SparkSession@15f3f9cf,kafka,List(),None,List(),None,Map(startingOffsets -> earliest, subscribe -> join_test, kafka.bootstrap.servers -> localhost:9092),None), kafka, [key#28, value#29, topic#30, partition#31, offset#32L, timestamp#33, timestampType#34]
       +- SubqueryAlias y
       +- SubqueryAlias table
       +- StreamingRelation DataSource(org.apache.spark.sql.SparkSession@15f3f9cf,kafka,List(),None,List(),None,Map(startingOffsets -> earliest, subscribe -> join_test, kafka.bootstrap.servers -> localhost:9092),None), kafka, [key#28, value#29, topic#30, partition#31, offset#32L, timestamp#33, timestampType#34]
      

       

        Attachments

        Issue Links

          Activity

            People

            • Assignee:
              tdas Tathagata Das
              Reporter:
              tdas Tathagata Das

              Dates

              • Created:
                Updated:
                Resolved:

                Issue deployment