  1. Apache Tez
  2. TEZ-4541

Remove or limit ever-growing DAG collections from DAGAppMaster

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      TEZ-1495 introduced a DAG ID collection that tracks all DAG IDs, apparently only so that the client handler can give a different response for a finished DAG than for an unknown one:

      https://github.com/apache/tez/blob/f8c2e11d0b469748ea95381e7021266e25e5ac89/tez-dag/src/main/java/org/apache/tez/dag/api/client/DAGClientHandler.java#L101-L111

          if (!currentDAGIdStr.equals(dagIdStr)) {
            if (getAllDagIDs().contains(dagIdStr)) {
              LOG.debug("Looking for finished dagId {} current dag is {}", dagIdStr, currentDAGIdStr);
              throw new DAGNotRunningException("DAG " + dagIdStr + " Not running, current dag is " +
                  currentDAGIdStr);
            } else {
              LOG.warn("Current DAGID : " + currentDAGIdStr + ", Looking for string (not found): " +
                  dagIdStr + ", dagIdObj: " + dagId);
              throw new TezException("Unknown dagId: " + dagIdStr);
            }
          }
      

      I can see that DAGNotRunningException is used by DAGClientImpl to handle edge cases: it infers DAG completion when the DAG is not the current DAG but is present in the DAG ID collection. That is fine, so instead of removing this collection we might want to limit its size, e.g. to 500. DAGAppMaster would then keep responding as expected for a certain amount of time, and the current contract would not be broken.
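
      A size limit like the one suggested above could be sketched as follows. This is only an illustration, not existing Tez code: the class name BoundedDagIdSet, the factory method newBoundedSet and the constant MAX_TRACKED_DAG_IDS are hypothetical, and it assumes the DAG IDs are kept as strings in a Set. It relies on the standard LinkedHashMap.removeEldestEntry hook to evict the oldest ID once the limit is exceeded.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class BoundedDagIdSet {
  // Hypothetical limit, taken from the "e.g. to 500" suggestion in the description.
  static final int MAX_TRACKED_DAG_IDS = 500;

  private BoundedDagIdSet() {
  }

  /** Returns a thread-safe set that keeps only the most recently added maxSize entries. */
  static Set<String> newBoundedSet(final int maxSize) {
    // Insertion-ordered map: the eldest entry is the DAG ID that was added first.
    Map<String, Boolean> backing = new LinkedHashMap<String, Boolean>() {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
        // Called after each insertion; returning true drops the oldest DAG ID.
        return size() > maxSize;
      }
    };
    // LinkedHashMap is not thread-safe, so wrap the set view in a synchronized wrapper.
    return Collections.synchronizedSet(Collections.newSetFromMap(backing));
  }

  public static void main(String[] args) {
    // Small limit so the eviction is visible; a real limit would be MAX_TRACKED_DAG_IDS.
    Set<String> dagIds = newBoundedSet(3);
    for (int i = 1; i <= 5; i++) {
      dagIds.add("dag_" + i);
    }
    System.out.println(dagIds); // prints [dag_3, dag_4, dag_5]
  }
}
```

      With such a bound, a client asking about a DAG whose ID has already been evicted would fall into the "Unknown dagId" branch of the check quoted above instead of getting DAGNotRunningException, so the limit should be large enough that clients have stopped polling by then.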

            People

              abstractdog László Bodor
              abstractdog László Bodor

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 20m