Apache Tez / TEZ-3452

Auto-reduce parallelism calculation can overflow with large inputs


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.2, 0.9.0, 0.8.5
    • Component/s: None
    • Labels: None

    Description

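      Overflow can occur when numTasks is high (say 45000), outputSize is high (say 311 TB), and slow start is set to 1.0. With values of that size, the intermediate product numTasks * outputSize in the ShuffleVertexManager code below is roughly 1.5 × 10^19, which exceeds the maximum signed 64-bit value (about 9.22 × 10^18), assuming outputSize is a long byte count. The product wraps around before the division by numVMEventsReceived is applied.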

      ShuffleVertexManager
          for (Map.Entry<String, SourceVertexInfo> vInfo : getBipartiteInfo()) {
            SourceVertexInfo srcInfo = vInfo.getValue();
            if (srcInfo.numTasks > 0 && srcInfo.numVMEventsReceived > 0) {
              // this assumes that 1 vmEvent is received per completed task - TEZ-2961
              expectedTotalSourceTasksOutputSize += 
                  (srcInfo.numTasks * srcInfo.outputSize) / srcInfo.numVMEventsReceived;
            }
          }
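
      The wrap-around can be reproduced outside Tez with a minimal sketch. The class OverflowSketch and its two methods are hypothetical names chosen for illustration. The field types (int for numTasks and numVMEventsReceived, long for outputSize) and the reading of 311 TB as 311 × 2^40 bytes are assumptions. The BigInteger variant is one possible way to avoid the overflow and is not necessarily what the attached patches do.

```java
import java.math.BigInteger;

public class OverflowSketch {

    // Same arithmetic shape as the quoted loop body: the product is
    // evaluated in 64-bit long arithmetic and silently wraps on overflow.
    static long naiveEstimate(int numTasks, long outputSize, int numVMEventsReceived) {
        return (numTasks * outputSize) / numVMEventsReceived;
    }

    // Hypothetical alternative: carry the intermediate product in a
    // BigInteger so that it cannot overflow before the division.
    static BigInteger wideEstimate(int numTasks, long outputSize, int numVMEventsReceived) {
        return BigInteger.valueOf(numTasks)
            .multiply(BigInteger.valueOf(outputSize))
            .divide(BigInteger.valueOf(numVMEventsReceived));
    }

    public static void main(String[] args) {
        int numTasks = 45000;
        long outputSize = 311L * (1L << 40); // assumed: 311 TB expressed in bytes
        int numVMEventsReceived = 45000;     // assumed: one event per task, all tasks complete

        System.out.println("naive: " + naiveEstimate(numTasks, outputSize, numVMEventsReceived));
        System.out.println("wide:  " + wideEstimate(numTasks, outputSize, numVMEventsReceived));
    }
}
```

      With these inputs the naive estimate comes out negative, while the wide estimate equals outputSize, because numVMEventsReceived equals numTasks. Math.multiplyExact is another option when failing fast with an ArithmeticException is preferable to a silently wrong value.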
      

      Attachments

        1. TEZ-3452.1.patch
          0.9 kB
          Jonathan Turner Eagles
        2. TEZ-3452.2.patch
          10 kB
          Jonathan Turner Eagles
        3. TEZ-3452.3.patch
          12 kB
          Jonathan Turner Eagles


          People

            Assignee: Jonathan Turner Eagles (jeagles)
            Reporter: Jonathan Turner Eagles (jeagles)
            Votes: 0
            Watchers: 2
