Apache Tez / TEZ-1447

Provide a mechanism for InputInitializers to know about interesting Vertex state changes

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.5.1
    • Component/s: None
    • Labels: None

    Description

      I'm trying to do dynamic partition pruning through input initializer events in Hive. That means that the initializer of a table scan vertex has to receive events from all tasks in another vertex (which contain the pruning info) before generating tasks to run.

      The problems I ran into with the current API:

      getNumTasks: I'm currently using a busy loop to wait for the number of tasks in a vertex to be decided (-1 -> x). There's no way around it, because that is the only way to find out how many events to expect (0 is a valid number of tasks, so I can't simply wait for the first task to complete).
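The polling workaround described above can be sketched roughly as follows. This is an illustration only: `NumTasksPoller` and `awaitNumTasks` are hypothetical names, and the `IntSupplier` stands in for whatever call returns the vertex's task count. The only detail taken from the description is that -1 means "not decided yet" and that 0 is a valid result.

```java
import java.util.function.IntSupplier;

// Hypothetical helper for illustration; not part of the Tez API.
final class NumTasksPoller {
    // Sentinel from the description: the task count has not been decided yet.
    static final int UNDECIDED = -1;

    // Polls the supplier until it reports a decided task count (>= 0).
    // Note that 0 is a legitimate answer, so the loop cannot key off
    // "the first task completed".
    static int awaitNumTasks(IntSupplier numTasks, long sleepMillis) {
        int n = numTasks.getAsInt();
        while (n == UNDECIDED) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while polling", e);
            }
            n = numTasks.getAsInt();
        }
        return n;
    }
}
```

The cost of this pattern is a thread that wakes up repeatedly for no reason other than to ask the same question again.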

      With auto-reducer parallelism I have to employ another busy loop, because I might initially be expecting 10 events, and that number later gets knocked down to 5. Since there's no event associated with this change, I have to periodically check whether I have enough events.
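If the initializer were notified whenever the task count of the source vertex is decided or changed, both busy loops could in principle be replaced by a blocking wait. The sketch below shows one way that could look. It is not the Tez API: `ExpectedEventGate`, `onNumTasksChanged`, `onEvent`, and `awaitSatisfied` are hypothetical names, and the sketch assumes the framework would invoke `onNumTasksChanged` on every change.

```java
// Hypothetical helper for illustration; none of these names come from the Tez API.
final class ExpectedEventGate {
    private int expected = -1; // -1: task count of the source vertex not decided yet
    private int received = 0;

    // Assumed to be called when the task count is decided or later reduced
    // (for example 10 -> 5 under auto-reducer parallelism).
    synchronized void onNumTasksChanged(int numTasks) {
        expected = numTasks;
        notifyAll();
    }

    // Called once per event received from a source task.
    synchronized void onEvent() {
        received++;
        notifyAll();
    }

    // True once the count is known and at least that many events have arrived.
    synchronized boolean isSatisfied() {
        return expected >= 0 && received >= expected;
    }

    // Blocks until satisfied or the timeout elapses; wakes only on a change.
    synchronized boolean awaitSatisfied(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (!isSatisfied()) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;
                }
                wait(remaining);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With this shape, a reduction from 10 to 5 expected events releases the waiter immediately if 5 events have already arrived, and a vertex with 0 tasks is satisfied as soon as its count is known.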

      Versioning: Events have a version number, but I don't know which task they come from, so I can't de-duplicate them.
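To make the de-duplication problem concrete: if each event also identified its source task, the receiver could keep the highest version seen per task and drop the rest. The sketch below assumes exactly that. `EventDeduplicator`, `accept`, and the `taskIndex` parameter are hypothetical, and the rule that a higher version supersedes a lower one is an assumption, not something stated in this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Tez API.
final class EventDeduplicator {
    // Highest version seen so far, keyed by the (assumed) source task index.
    private final Map<Integer, Integer> latestVersionByTask = new HashMap<>();

    // Returns true if the event is the first from this task or carries a
    // higher version than anything seen from it; false for duplicates and
    // stale versions. Assumption: a higher version supersedes a lower one.
    boolean accept(int taskIndex, int version) {
        Integer seen = latestVersionByTask.get(taskIndex);
        if (seen != null && seen >= version) {
            return false;
        }
        latestVersionByTask.put(taskIndex, version);
        return true;
    }

    // Number of distinct source tasks that have reported at least once.
    int distinctTasks() {
        return latestVersionByTask.size();
    }
}
```

Without a task identifier on the event there is no key for the map, which is why the version number alone is not enough.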

      Attachments

        1. TEZ-1447.1.wip.txt
          41 kB
          Siddharth Seth
        2. TEZ-1447.2.txt
          58 kB
          Siddharth Seth
        3. TEZ-1447.3.txt
          67 kB
          Siddharth Seth
        4. TEZ-1447.4.txt
          66 kB
          Siddharth Seth
        5. TEZ-1447.5.txt
          67 kB
          Siddharth Seth
        6. TEZ-1447.5.addendum.txt
          4 kB
          Siddharth Seth

            People

              Assignee: Siddharth Seth (sseth)
              Reporter: Gunther Hagleitner (hagleitn)
              Votes: 0
              Watchers: 5
