Apache Storm / STORM-3016

Nimbus goes down when a topology has components with large total parallelism


    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: storm-core
    • Labels:

      Description

      When a topology with a large number of parallel components (total parallelism rising to 5000, for example) is submitted to the Storm cluster, Nimbus might crash. The workflow is as follows:

      1) Nimbus computes the assignment.

      2) Nimbus sends the assignment to ZooKeeper.

      3) When the assignment mapping string is too long, because the topology's total parallelism is too large, writing it to ZooKeeper fails (the default znode data-length limit is 1 MB).

      4) Nimbus keeps retrying to send the assignment; after a number of attempts it gives up and crashes. When that happens, the stability of the cluster is greatly impacted.
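
      The failure in step 3 can be sketched roughly as follows. This is an illustrative estimate only: the per-executor byte count is a hypothetical average, not Storm's actual serialization format, and the 1 MB cap corresponds to ZooKeeper's default `jute.maxbuffer` znode size limit.

      ```python
      # Default ZooKeeper znode data limit (jute.maxbuffer): 1 MB.
      ZNODE_LIMIT = 1 * 1024 * 1024

      # Hypothetical average serialized size of one executor's entry
      # (executor id, host:port slot, start time) in the assignment map.
      BYTES_PER_EXECUTOR = 300

      def assignment_size(total_parallelism):
          """Estimated serialized size of the assignment, in bytes."""
          return total_parallelism * BYTES_PER_EXECUTOR

      def fits_in_znode(total_parallelism):
          """Whether the estimated assignment fits in a single znode."""
          return assignment_size(total_parallelism) <= ZNODE_LIMIT

      # A modest topology fits; at ~5000 executors the write would fail.
      print(fits_in_znode(1000))   # True
      print(fits_in_znode(5000))   # False
      ```

      Under these assumptions, a topology in the few-thousand-executor range is enough to push a single-znode assignment past the default limit, which matches the failure mode described above.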

        Attachments

        1. nimbus.log
          787 kB
          StaticMian


            People

            • Assignee: Unassigned
            • Reporter: StaticMian
            • Votes: 0
            • Watchers: 1

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated: 96h
                Remaining: 96h
                Logged: Not Specified