  1. Apache Tez
  2. TEZ-3209 Support for fair custom data routing
  3. TEZ-4508

Allow the FAIR_PARALLELISM mode to accept multiple source vertices

Details

    • Type: Sub-task
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.10.2
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      Currently, FairShuffleVertexManager with FAIR_PARALLELISM fails when two or more edges are configured with SCATTER_GATHER.
      https://github.com/apache/tez/blob/rel/release-0.10.2/tez-runtime-library/src/main/java/org/apache/tez/dag/library/vertexmanager/FairShuffleVertexManager.java#L198-L204

      Judging from TEZ-3500, a vertex typically ends up with multiple such sources when it performs a JOIN. For that case, I agree that we definitely need more than what the current FairShuffleVertexManager provides.

      However, the current fair routing works well enough with multiple sources when the sources are symmetric. One case I have in mind is UNION ALL followed by a bucketed INSERT.
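
      To illustrate why symmetric sources are the easy case, here is a minimal sketch. It is not the Tez implementation and uses no Tez API; the class and method names are hypothetical. It assumes "symmetric" means every source vertex produces the same number of partitions with the same partitioning, so per-partition sizes can simply be summed before a task count is chosen for each partition. It covers only the splitting of large partitions, not the grouping of small ones.

```java
// Hypothetical illustration only; none of these names exist in Tez.
final class FairParallelismSketch {

  /** Sums the per-partition output sizes over all source vertices. */
  static long[] aggregate(long[][] sizesBySource) {
    if (sizesBySource.length == 0) {
      throw new IllegalArgumentException("at least one source is required");
    }
    int numPartitions = sizesBySource[0].length;
    long[] total = new long[numPartitions];
    for (long[] sizes : sizesBySource) {
      // Assumption: symmetric sources share the same partition count.
      if (sizes.length != numPartitions) {
        throw new IllegalArgumentException("sources are not symmetric");
      }
      for (int p = 0; p < numPartitions; p++) {
        total[p] += sizes[p];
      }
    }
    return total;
  }

  /** Picks ceil(size / desired) destination tasks per partition, at least 1. */
  static int[] tasksPerPartition(long[] totalSizes, long desiredTaskInputSize) {
    if (desiredTaskInputSize <= 0) {
      throw new IllegalArgumentException("desired task input size must be positive");
    }
    int[] tasks = new int[totalSizes.length];
    for (int p = 0; p < totalSizes.length; p++) {
      long n = (totalSizes[p] + desiredTaskInputSize - 1) / desiredTaskInputSize;
      tasks[p] = (int) Math.max(1L, n);
    }
    return tasks;
  }
}
```

      With two UNION ALL branches that report sizes {100, 10} and {150, 20} and a desired task input size of 100, the aggregated sizes are {250, 30}, so the first partition would be split across 3 tasks and the second handled by 1. A JOIN breaks the assumption, because the two sides carry different data and cannot be merged into a single per-partition total this way.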

            People

              okumin Shohei Okumiya
              okumin Shohei Okumiya

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 50m