Spark / SPARK-38881

PySpark Kinesis Streaming should expose metricsLevel CloudWatch config that is already supported in the Scala/Java APIs

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.2.1
    • Fix Version/s: 3.4.0
    • Component/s: DStreams, Input/Output, PySpark
    • Labels: None

    Description

      This relates to https://issues.apache.org/jira/browse/SPARK-27420, which was merged as part of Spark 3.0.0.

      This change is desirable because it takes the metricsLevel config parameter, which was added to the Scala/Java Spark APIs for the Kinesis Streaming integration, and makes it available in the PySpark API as well.

      This change passes all tests. It was also tested locally against a development Kinesis stream in AWS, which confirmed two things. With MetricsLevel.NONE specified when creating the PySpark Kinesis stream, metrics were no longer reported to CloudWatch. With the metricsLevel parameter left out, the behavior was the same as today: the level defaulted to DETAILED and CloudWatch metrics appeared again.
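
      As a rough illustration of the usage described above, the sketch below passes a metrics level through to KinesisUtils.createStream from PySpark. It assumes Spark 3.4.0 or later with the spark-streaming-kinesis-asl package on the classpath, and that pyspark.streaming.kinesis exposes a MetricsLevel class with DETAILED, SUMMARY and NONE members, as the linked pull request adds. The helper names (normalize_metrics_level, create_kinesis_stream), the endpoint URL pattern, and the 10-second checkpoint interval are illustrative assumptions, not part of the change itself.

```python
VALID_METRICS_LEVELS = ("DETAILED", "SUMMARY", "NONE")


def normalize_metrics_level(name=None):
    """Return an upper-case level name; None falls back to DETAILED."""
    if name is None:
        return "DETAILED"
    level = name.strip().upper()
    if level not in VALID_METRICS_LEVELS:
        raise ValueError(f"unknown metrics level: {name!r}")
    return level


def create_kinesis_stream(ssc, app_name, stream_name, region, metrics_level=None):
    """Hypothetical helper: build a Kinesis DStream with the given metrics level."""
    # Imported lazily so this module loads even without PySpark installed.
    from pyspark.streaming.kinesis import (
        InitialPositionInStream,
        KinesisUtils,
        MetricsLevel,
    )

    # Assumption: MetricsLevel has attributes named DETAILED, SUMMARY, NONE.
    level = getattr(MetricsLevel, normalize_metrics_level(metrics_level))
    return KinesisUtils.createStream(
        ssc,
        kinesisAppName=app_name,
        streamName=stream_name,
        # Assumed endpoint pattern; adjust for your partition or VPC endpoint.
        endpointUrl=f"https://kinesis.{region}.amazonaws.com",
        regionName=region,
        initialPositionInStream=InitialPositionInStream.LATEST,
        checkpointInterval=10,  # seconds; illustrative value
        metricsLevel=level,
    )
```

      Calling create_kinesis_stream(ssc, "my-app", "my-stream", "us-east-1", "none") would correspond to the MetricsLevel.NONE case tested above, while omitting the last argument keeps the DETAILED default.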

      https://github.com/apache/spark/pull/36201

          People

            Assignee: Mark Khaitman (mkman84)
            Reporter: Mark Khaitman (mkman84)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: