Apache Storm / STORM-3602

loadaware shuffle can overload local worker

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0, 2.1.0
    • Fix Version/s: 2.2.0, 2.1.1
    • Component/s: None

      Description

      We saw a worker become overloaded, with tuples timing out, while loadaware shuffle was enabled. Investigation showed that the code allows switching from host local to worker local when the load average is below the low water mark. It should be checking the load on the worker instead.

       

      What happens is that the worker is overloaded while there are many idle host local tasks, so the grouping switches to HOST_LOCAL. The load average calculated across all the host tasks is then below the low water mark, so it immediately switches back to the overloaded worker local task.
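
      The flapping described above can be reproduced with a small model. This is only a sketch, not the Storm code: the class, method, and parameter names are hypothetical, and it assumes the step up to HOST_LOCAL happens when the worker load exceeds a high water mark. The only difference between the two methods is which load is compared against the low water mark when deciding whether to step back down.

```java
import java.util.List;

// Illustrative model only; not the Storm implementation. All names are hypothetical.
final class LocalityScopeSketch {
    enum Scope { WORKER_LOCAL, HOST_LOCAL }

    // Mean load over a set of target tasks (0.0 for an empty list).
    static double average(List<Double> loads) {
        return loads.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    // Behavior described in the report: the step back to WORKER_LOCAL is decided
    // by the average over all host local tasks, which idle tasks drag down.
    static Scope buggyNext(Scope current, double workerLoad, List<Double> hostLoads,
                           double lowWaterMark, double highWaterMark) {
        if (current == Scope.WORKER_LOCAL) {
            // Assumption: an overloaded worker triggers the step up to HOST_LOCAL.
            return workerLoad > highWaterMark ? Scope.HOST_LOCAL : Scope.WORKER_LOCAL;
        }
        return average(hostLoads) < lowWaterMark ? Scope.WORKER_LOCAL : Scope.HOST_LOCAL;
    }

    // Suggested behavior: only step back down once the worker itself has recovered.
    static Scope fixedNext(Scope current, double workerLoad, List<Double> hostLoads,
                           double lowWaterMark, double highWaterMark) {
        if (current == Scope.WORKER_LOCAL) {
            return workerLoad > highWaterMark ? Scope.HOST_LOCAL : Scope.WORKER_LOCAL;
        }
        // hostLoads is deliberately unused here; the worker load decides.
        return workerLoad < lowWaterMark ? Scope.WORKER_LOCAL : Scope.HOST_LOCAL;
    }

    public static void main(String[] args) {
        // One overloaded worker local task plus five idle host local tasks.
        double workerLoad = 0.9;
        List<Double> hostLoads = List.of(0.9, 0.0, 0.0, 0.0, 0.0, 0.0);
        Scope buggy = Scope.WORKER_LOCAL;
        Scope fixed = Scope.WORKER_LOCAL;
        for (int i = 0; i < 4; i++) {
            buggy = buggyNext(buggy, workerLoad, hostLoads, 0.2, 0.8);
            fixed = fixedNext(fixed, workerLoad, hostLoads, 0.2, 0.8);
            System.out.println("step " + i + ": buggy=" + buggy + " fixed=" + fixed);
        }
    }
}
```

      With these made-up numbers the host average is 0.15, below the 0.2 low water mark, so the first method alternates between the two scopes on every evaluation, while the second stays at HOST_LOCAL until the worker load itself drops.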

       

       


              People

              • Assignee: Aaron Gresch (agresch)
              • Reporter: Aaron Gresch (agresch)
              • Votes: 0
              • Watchers: 2

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 1h