[SPARK-15999] Wrong/Missing information for Spark UI/REST interface


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: 1.5.0
    • Fix Version/s: None
    • Component/s: Documentation, DStreams
    • Labels: None
    • Environment: CDH5.5.2, Spark 1.5.0

    Description

      Spark Monitoring documentation

      https://spark.apache.org/docs/1.5.0/monitoring.html

      You can access this interface by simply opening http://<driver-node>:4040 in a web browser. If multiple SparkContexts are running on the same host, they will bind to successive ports beginning with 4040 (4041, 4042, etc).
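
      For illustration only (not part of the original report), a minimal sketch of probing those successive ports from Python; "driver-node" is a placeholder hostname:

      # Hedged sketch: probe the successive UI ports the documentation describes.
      # "driver-node" is a placeholder hostname, not a real host from this report.
      from urllib.request import urlopen

      driver_host = "driver-node"

      for port in range(4040, 4045):
          url = "http://{}:{}".format(driver_host, port)
          try:
              with urlopen(url, timeout=2) as resp:
                  print("Spark UI responding at", url, "with status", resp.status)
          except OSError as exc:
              print("No UI at", url, "-", exc)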

      This statement is very confusing and does not seem to apply at all to Spark Streaming jobs (unless I am missing something).

      The same is true for the REST API calls.

      REST API
      In addition to viewing the metrics in the UI, they are also available as JSON. This gives developers an easy way to create new visualizations and monitoring tools for Spark. The JSON is available for both running applications, and in the history server. The endpoints are mounted at /api/v1. Eg., for the history server, they would typically be accessible at http://<server-url>:18080/api/v1, and for a running application, at http://localhost:4040/api/v1.
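
      As a hedged sketch of what that description implies (it assumes the UI really is reachable at localhost:4040, which is exactly what this report questions for streaming jobs in yarn-cluster mode):

      # Sketch: list applications through the documented /api/v1 endpoint of a
      # running application. Assumes the UI is reachable at localhost:4040.
      import json
      from urllib.request import urlopen

      with urlopen("http://localhost:4040/api/v1/applications", timeout=5) as resp:
          apps = json.load(resp)

      for app in apps:
          print(app["id"], app["name"])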

      I am running a Spark Streaming job on CDH 5.5.2 with Spark 1.5.0, and nowhere on the driver node or the executor nodes am I able to call the REST service for the running/live application. My Spark Streaming jobs run in YARN cluster mode:
      --master yarn-cluster
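
      In yarn-cluster mode the driver (and its web UI) runs inside the application master, so it is usually reached through the YARN ResourceManager proxy rather than at <driver-node>:4040. A hedged sketch, where the ResourceManager address is a placeholder and whether the proxy forwards /api/v1 may depend on the cluster setup:

      # Hedged sketch: reach the running application's REST API through the YARN
      # ResourceManager proxy. The RM host/port is a placeholder; the app id is
      # taken from the history server output shown later in this report.
      import json
      from urllib.request import urlopen

      resource_manager = "resourcemanager-host:8088"
      app_id = "application_1463099418950_11635"
      url = "http://{}/proxy/{}/api/v1/applications".format(resource_manager, app_id)

      with urlopen(url, timeout=5) as resp:
          print(json.load(resp))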

      For the history server, however, I am able to call the REST service and pull up JSON messages using the URL
      http://historyServer:18088/api/v1/applications

      [ {
        "id" : "application_1463099418950_11465",
        "name" : "PySparkShell",
        "attempts" : [ {
          "startTime" : "2016-06-15T15:28:32.460GMT",
          "endTime" : "2016-06-15T19:01:39.100GMT",
          "sparkUser" : "abc",
          "completed" : true
        } ]
      }, {
        "id" : "application_1463099418950_11635",
        "name" : "DataProcessor-ETL.ETIME",
        "attempts" : [ {
          "attemptId" : "1",
          "startTime" : "2016-06-15T18:56:04.413GMT",
          "endTime" : "2016-06-15T18:58:00.022GMT",
          "sparkUser" : "abc",
          "completed" : true
        } ]
      }, 
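
      A small sketch of consuming that history server endpoint programmatically, using the same host and port as in the URL above:

      # Sketch: pull and flatten the history server application list shown above.
      import json
      from urllib.request import urlopen

      with urlopen("http://historyServer:18088/api/v1/applications", timeout=5) as resp:
          apps = json.load(resp)

      for app in apps:
          for attempt in app["attempts"]:
              print(app["id"], app["name"], attempt["startTime"], attempt["completed"])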
      

      In addition, the following description points to a broken link, http://metrics.codahale.com/:

      Spark has a configurable metrics system based on the Coda Hale Metrics Library.

      People

        Assignee: Unassigned
        Reporter: Faisal (faisal.siddiqui)