  1. Apache Tez
  2. TEZ-4213

Bound appContext executor capacity using a configurable property

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.10.0
    • Component/s: None

    Description

      After TEZ-4170 was merged, the appContext executor pool is also used by the RootInputInitializerManager to speed up split generation.

      However, this executor pool currently has no capacity limit: https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java#L624

      When generating splits for a large number of inputs (thousands or more), this can result in a java.lang.OutOfMemoryError, which is also reproducible with a test case:
      https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/dag/RootInputInitializerManager.java#L130

      To avoid such errors, I propose limiting the capacity of this pool to a configurable value, which could default to, for example, the number of physical cores.
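
      A minimal sketch of the proposal, using only java.util.concurrent. This is not the code from the attached patches: the class name, the property key "example.appcontext.thread-count-limit" and the helper methods are hypothetical, and the default uses Runtime.availableProcessors(), which reports logical processors rather than physical cores.

```java
import java.util.Properties;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only: none of these names come from the Tez code base.
class BoundedExecutorSketch {
    // Hypothetical property key used to configure the pool capacity.
    static final String THREAD_LIMIT_KEY = "example.appcontext.thread-count-limit";

    // Read the limit from the configuration; fall back to the number of
    // logical processors when the key is missing or not positive.
    static int resolveThreadLimit(Properties conf) {
        int defaultLimit = Runtime.getRuntime().availableProcessors();
        String raw = conf.getProperty(THREAD_LIMIT_KEY);
        if (raw == null) {
            return defaultLimit;
        }
        int parsed = Integer.parseInt(raw.trim());
        return parsed > 0 ? parsed : defaultLimit;
    }

    // At most `limit` threads ever exist; extra tasks wait in the queue
    // instead of each getting a new thread.
    static ThreadPoolExecutor newBoundedPool(int limit) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                limit, limit, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(true); // let idle threads exit
        return pool;
    }

    // Submit `tasks` short jobs and report the highest number that ran at once.
    static int peakConcurrency(int limit, int tasks) {
        ThreadPoolExecutor pool = newBoundedPool(limit);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(2); // stand-in for split generation work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(THREAD_LIMIT_KEY, "4");
        int limit = resolveThreadLimit(conf);
        System.out.println("limit=" + limit + " peak=" + peakConcurrency(limit, 1000));
    }
}
```

      In this sketch the thread count is bounded while the work queue is not, so a burst of thousands of tasks is held as queued Runnables rather than as live threads. Whether the queue should also be bounded is a separate design choice that this issue does not address.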

      Attachments

        1. TEZ-4213.01.patch
          6 kB
          Panagiotis Garefalakis
        2. TEZ-4213.02.patch
          13 kB
          Panagiotis Garefalakis
        3. TEZ-4213.03.patch
          13 kB
          Panagiotis Garefalakis
        4. TEZ-4213.04.patch
          13 kB
          Panagiotis Garefalakis
        5. TEZ-4213.05.patch
          13 kB
          Panagiotis Garefalakis
        6. TEZ-4213.06.patch
          13 kB
          Panagiotis Garefalakis
        7. TEZ-4213.07.patch
          14 kB
          Panagiotis Garefalakis
        8. TEZ-4213.08.patch
          14 kB
          Panagiotis Garefalakis

            People

              Assignee: Panagiotis Garefalakis (pgaref)
              Reporter: Panagiotis Garefalakis (pgaref)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h 40m