Apache Tez / TEZ-3999

Extend VertexManagerPlugin interface to allow for relevant events notification

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      This task is part of the umbrella task TEZ-3997.

      For a concurrent connection, the downstream and upstream vertices run concurrently, and in some cases they are also scheduled at the same time, as in (sub-graph) gang scheduling. However, this is not always true. In the example in Fig. 1, tasks in the PS vertex should be running before tasks in the W vertex are scheduled; otherwise, if the resource requests for PS cannot be fulfilled first, W will spin in vain. In other examples, we can start scheduling downstream tasks as soon as some of the tasks in the upstream vertex are running.
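
      The two cases above differ only in how many upstream tasks must be running before downstream scheduling may begin. A minimal sketch of such a gate follows; the class `RunningSourceGate`, its integer task indices, and the `requiredRunning` threshold are hypothetical names chosen for illustration and are not part of Tez.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical gate that opens once enough upstream tasks are running. */
final class RunningSourceGate {
    // Number of distinct upstream tasks that must be running before the
    // downstream vertex may be scheduled (assumption: chosen per use case,
    // e.g. every upstream task, or only a few of them).
    private final int requiredRunning;
    private final Set<Integer> runningSourceTasks = new HashSet<>();

    RunningSourceGate(int requiredRunning) {
        if (requiredRunning <= 0) {
            throw new IllegalArgumentException("requiredRunning must be positive");
        }
        this.requiredRunning = requiredRunning;
    }

    /**
     * Records that an upstream task is running and reports whether the gate
     * is open. Repeated notifications for the same task index (for example
     * from a retried attempt) are counted once.
     */
    boolean onSourceTaskRunning(int taskIndex) {
        runningSourceTasks.add(taskIndex);
        return isOpen();
    }

    boolean isOpen() {
        return runningSourceTasks.size() >= requiredRunning;
    }
}
```

      With a threshold equal to the upstream task count, the gate stays closed until every upstream task has reported; with a threshold of 1, the first running upstream task opens it.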

       

      In other words, in the context of the existing VertexManagerPlugin interface and implementation, there is a strong duality between “OnSourceTaskRunning” for a concurrent connection and “OnSourceTaskCompleted” for the (existing) sequential connection. Therefore, we propose adding an “onConcurrentSourceTaskRunning(TaskAttemptIdentifier attempt)” method to the VertexManagerPlugin interface, with a default implementation that reports the operation as not supported.
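
      The shape of the proposed addition can be sketched as follows. This is not the Tez source: `PluginSketch` is a hypothetical reduction of the plugin contract to the two callbacks discussed, `AttemptId` is a stand-in for TaskAttemptIdentifier, and throwing `UnsupportedOperationException` is only one assumed way to express "not supported".

```java
import java.util.ArrayList;
import java.util.List;

final class ConcurrentCallbackSketch {
    /** Hypothetical stand-in for Tez's TaskAttemptIdentifier, reduced to a label. */
    static final class AttemptId {
        final String label;
        AttemptId(String label) { this.label = label; }
    }

    /** Hypothetical reduction of the plugin contract to the two dual callbacks. */
    abstract static class PluginSketch {
        /** Existing callback used for sequential connections. */
        abstract void onSourceTaskCompleted(AttemptId attempt);

        /**
         * Proposed callback for concurrent connections. Plugins that do not
         * override it reject the notification (assumed behavior).
         */
        void onConcurrentSourceTaskRunning(AttemptId attempt) {
            throw new UnsupportedOperationException(
                getClass().getSimpleName() + " does not handle running-source notifications");
        }
    }

    /** A plugin written before the change: it only knows about completion. */
    static final class SequentialOnlyPlugin extends PluginSketch {
        int completed = 0;
        @Override void onSourceTaskCompleted(AttemptId attempt) { completed++; }
    }

    /** A plugin that opts in by overriding the new callback. */
    static final class ConcurrentAwarePlugin extends PluginSketch {
        final List<String> running = new ArrayList<>();
        @Override void onSourceTaskCompleted(AttemptId attempt) { running.remove(attempt.label); }
        @Override void onConcurrentSourceTaskRunning(AttemptId attempt) { running.add(attempt.label); }
    }
}
```

      Giving the new callback a default body keeps plugins written against the old contract compiling unchanged, while making it obvious at run time if such a plugin is attached to a concurrent edge.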

      This change will also include the logic to generate source-task-running events and send them to downstream vertices. To reduce unnecessary event traffic, such events are sent only over CONCURRENT edges, and only when the ConcurrentSchedulingType is specified as SOURCE_TASK_STARTED.
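
      The event filter described above amounts to a two-part check per outgoing edge. The sketch below uses local stand-in types rather than Tez classes: `Edge`, `shouldSendRunningEvent`, and `targetsToNotify` are hypothetical, and the enum value `SOURCE_VERTEX_CONFIGURED` is assumed here only so that the filter has a concurrent edge to reject.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RunningEventFilter {
    enum SchedulingType { SEQUENTIAL, CONCURRENT }

    // SOURCE_TASK_STARTED comes from the description; the other value is an
    // assumption made for this sketch.
    enum ConcurrentSchedulingType { SOURCE_VERTEX_CONFIGURED, SOURCE_TASK_STARTED }

    /** Hypothetical view of an outgoing edge of the vertex whose task started. */
    static final class Edge {
        final String targetVertex;
        final SchedulingType schedulingType;
        final ConcurrentSchedulingType concurrentType; // null on sequential edges

        Edge(String targetVertex, SchedulingType schedulingType,
             ConcurrentSchedulingType concurrentType) {
            this.targetVertex = targetVertex;
            this.schedulingType = schedulingType;
            this.concurrentType = concurrentType;
        }
    }

    /** True only for CONCURRENT edges configured with SOURCE_TASK_STARTED. */
    static boolean shouldSendRunningEvent(Edge edge) {
        return edge.schedulingType == SchedulingType.CONCURRENT
            && edge.concurrentType == ConcurrentSchedulingType.SOURCE_TASK_STARTED;
    }

    /** Names of the downstream vertices that should receive the event. */
    static List<String> targetsToNotify(List<Edge> outEdges) {
        List<String> targets = new ArrayList<>();
        for (Edge edge : outEdges) {
            if (shouldSendRunningEvent(edge)) {
                targets.add(edge.targetVertex);
            }
        }
        return targets;
    }
}
```

      Vertices reached only over sequential edges, or over concurrent edges with a different trigger, receive no running events under this check, which is where the traffic saving would come from.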

          People

            Assignee: Yingda Chen
            Reporter: Yingda Chen