Apache Storm / STORM-3716

Nodes underutilized

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: storm-core
    • Labels: None

    Description

      Topologies that use anchored tuples do not distribute work across multiple nodes, regardless of the computational demands of the bolts. On a single node the topology works fine, but once more nodes are added to the cluster, only one machine gets pegged. When we disable anchoring, the same topology distributes across all nodes just fine, pegging each machine appropriately.

      This bug manifests from version 2.1 forward. I first encountered it on my own production cluster, in an application that performs significant NLP computation across hundreds of millions of documents. That topology is fairly complex, so I developed a very simple exemplar that demonstrates the issue with only one spout and one bolt. I pushed this demonstration to GitHub so that the developers can easily isolate the bug and perhaps suggest a workaround. The topology is built and packaged with Gradle, and the code is well documented, so reproducing the issue should be fairly simple. I first saw the problem on three 32-core nodes. For experimentation I set up a test cluster of 8-core nodes and later increased each node to 16 cores; there was plenty of memory in every case.

      The topology is available on GitHub at https://github.com/cowchipkid/storm-issue.git.

          People

            Assignee: Unassigned
            Reporter: Thomas L Redman (tomredman@mchsi.com)
            Votes: 0
            Watchers: 1
