SPARK-46873

PySpark spark.streams should not recreate new StreamingQueryManager


Details

    Description

      In Scala, there is exactly one streaming query manager per Spark session, so repeated calls to spark.streams return the same object:

      ```
      scala> spark.streams
      val res0: org.apache.spark.sql.streaming.StreamingQueryManager = org.apache.spark.sql.streaming.StreamingQueryManager@46bb8cba

      scala> spark.streams
      val res1: org.apache.spark.sql.streaming.StreamingQueryManager = org.apache.spark.sql.streaming.StreamingQueryManager@46bb8cba

      scala> spark.streams
      val res2: org.apache.spark.sql.streaming.StreamingQueryManager = org.apache.spark.sql.streaming.StreamingQueryManager@46bb8cba

      scala> spark.streams
      val res3: org.apache.spark.sql.streaming.StreamingQueryManager = org.apache.spark.sql.streaming.StreamingQueryManager@46bb8cba
      ```

       

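      In Python, this currently does not hold. Each access to spark.streams returns a new StreamingQueryManager object, as the differing addresses show: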

      ```
      >>> spark.streams
      <pyspark.sql.connect.streaming.query.StreamingQueryManager object at 0x1011f7c10>
      >>> spark.streams
      <pyspark.sql.connect.streaming.query.StreamingQueryManager object at 0x1011f71f0>
      >>> spark.streams
      <pyspark.sql.connect.streaming.query.StreamingQueryManager object at 0x1011f7be0>
      >>> spark.streams
      <pyspark.sql.connect.streaming.query.StreamingQueryManager object at 0x1011f7c40>
      ```

       

      Python should align with the Scala behavior and return the same StreamingQueryManager for a given session.
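
One way the desired behavior could look is a lazily created, cached manager on the session. The sketch below is only an illustration and does not use PySpark: `SessionSketch` and `StreamingQueryManagerSketch` are hypothetical stand-ins, and the actual fix in PySpark may be implemented differently.

```python
from typing import Optional


class StreamingQueryManagerSketch:
    """Hypothetical stand-in for StreamingQueryManager (not the PySpark class)."""

    def __init__(self, session: "SessionSketch") -> None:
        # Keep a reference to the owning session, as a real manager would need.
        self.session = session


class SessionSketch:
    """Hypothetical stand-in for a Spark session (not the PySpark class)."""

    def __init__(self) -> None:
        # No manager exists until `streams` is first accessed.
        self._streams: Optional[StreamingQueryManagerSketch] = None

    @property
    def streams(self) -> StreamingQueryManagerSketch:
        # Create the manager on first access, then reuse it on later accesses.
        if self._streams is None:
            self._streams = StreamingQueryManagerSketch(self)
        return self._streams


spark = SessionSketch()

# Repeated accesses on one session yield the same object.
same_manager = spark.streams is spark.streams

# A different session still gets its own manager.
other_session_differs = SessionSketch().streams is not spark.streams
```

With this shape, `spark.streams is spark.streams` evaluates to true for a single session, which mirrors the Scala transcript, while separate sessions keep separate managers.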

            People

              WweiL Wei Liu