  1. Apache Storm
  2. STORM-2736

o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with key

    Details

      Description

      Sometimes, after our topologies have been running for a while, ZooKeeper does not respond in time and we see:

      2017-08-16 10:18:38.859 o.a.s.zookeeper [INFO] ip-10-181-20-70.ec2.internal lost leadership.
      2017-08-16 10:21:31.144 o.a.s.zookeeper [INFO] ip-10-181-20-70.ec2.internal gained leadership, checking if it has all the topology code locally.
      2017-08-16 10:21:46.201 o.a.s.zookeeper [INFO] Accepting leadership, all active topology found localy.
      

      That's fine, and we probably need to allocate more resources. But after a new leader is chosen, we then see:

      o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with key<key>
      

      over and over.

      I haven't yet worked out how to trigger the conditions that make ZooKeeper unresponsive, but the BlobStoreUtils error can be reproduced by restarting ZooKeeper.

      The problem, I think, is that the loop here never executes because the nimbusInfos list is empty. If I add a check similar to this for a node which exists but has no children, the error goes away.
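
      To illustrate the suspected failure mode, here is a minimal, self-contained sketch. It is not the Storm source: the class `BlobUpdateSketch`, the method names, and the use of plain strings for Nimbus hosts are all assumptions made for illustration. It shows why a download loop over an empty list always ends in the error message, and how a guard for a node that exists but has no children would avoid it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; none of these names are taken from Storm itself.
public class BlobUpdateSketch {
    // Records what would be logged at ERROR level so the behaviour can be observed.
    static final List<String> errors = new ArrayList<>();

    // Stand-in for the download loop: try each Nimbus host until one succeeds.
    static boolean downloadUpdatedBlob(String key, List<String> nimbusHosts) {
        boolean isSuccess = false;
        for (String host : nimbusHosts) {
            // Assumption: any non-empty host name counts as a successful download.
            if (!host.isEmpty()) {
                isSuccess = true;
                break;
            }
        }
        // With an empty list the loop body never runs, so this always fires.
        if (!isSuccess) {
            errors.add("Could not update the blob with key" + key);
        }
        return isSuccess;
    }

    // Guarded entry point: 'children' models the child nodes found under the
    // key's node (null = node does not exist, empty = node has no children).
    static boolean updateKey(String key, List<String> children) {
        if (children == null || children.isEmpty()) {
            return false; // nothing to download from, so skip without logging
        }
        return downloadUpdatedBlob(key, children);
    }

    public static void main(String[] args) {
        System.out.println(updateKey("my-key", List.of()));           // guarded path
        System.out.println(downloadUpdatedBlob("my-key", List.of())); // unguarded path
        System.out.println(errors);
    }
}
```

      In this sketch only the unguarded call records the error; whether skipping silently is the right behaviour for the real code is a separate design question.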


            People

            • Assignee:
              Heather McCartney (hmcc)
              Reporter:
              Heather McCartney (hmcc)
            • Votes:
              0
              Watchers:
              2


                Time Tracking

                Estimated:
                Not Specified
                Remaining:
                0h
                Logged:
                1h 40m