Spark / SPARK-3283

Receivers sometimes do not get spread out to multiple nodes


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: DStreams
    • Labels: None

    Description

      The probable cause is that the JobGenerator and JobScheduler start generating jobs, and therefore tasks, before the receivers have been launched. When the ReceiverTracker then submits the tasks containing the receivers, those tasks are assigned to whichever slots are free at that instant, and the free slots may all be on one node rather than spread across all the nodes.

      The original behavior was that jobs started only after the receivers had been started, which ensured that all slots were free and that the receivers were spread evenly across all the nodes.
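      The effect described above can be illustrated with a toy model. This is only a sketch and not Spark's scheduler: the function `place_receivers`, the node names, and the slot counts are all hypothetical, and the round-robin placement over nodes with free slots is an assumption made for illustration.

```python
from collections import Counter


def place_receivers(free_slots, num_receivers):
    """Toy placement: hand out receivers round-robin over nodes that still
    have a free slot. Hypothetical model, not Spark's actual scheduling code."""
    slots = dict(free_slots)  # copy so the caller's dict is not mutated
    placement = Counter()
    remaining = num_receivers
    while remaining > 0:
        progressed = False
        for node in slots:
            if remaining == 0:
                break
            if slots[node] > 0:
                slots[node] -= 1
                placement[node] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            break  # no free slots left anywhere
    return placement


# Original behavior: receivers start first, so every slot is still free.
spread = place_receivers({"node-1": 4, "node-2": 4, "node-3": 4}, 3)

# Reported behavior: batch tasks already occupy node-2 and node-3,
# so only node-1 has free slots when the receiver tasks are submitted.
skewed = place_receivers({"node-1": 3, "node-2": 0, "node-3": 0}, 3)

print(dict(spread))
print(dict(skewed))
```

      In this toy model the first call places one receiver on each node, while the second places all three receivers on node-1, which mirrors the skew reported in this issue.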

          People

            Assignee: Tathagata Das (tdas)
            Reporter: Tathagata Das (tdas)