Apache Storm / STORM-4077

Worker being reassigned when Nimbus leadership changes

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.1
    • Fix Version/s: 2.7.0
    • Component/s: None
    • Labels: None

    Description

      Hey guys, I'm using Storm v2.6.1 with three Nimbus instances for high availability. Every time I restart the Nimbus leader, the workers get reassigned. This is bad behaviour: a Nimbus leadership change alone leaves every topology with no running workers until new workers are assigned.

      Update:

      Essentially, because modTime is used as the blob version, we have found that, with LocalFsBlobStoreFile, the following occurs every time the Nimbus leader goes down:

      1. The Nimbus (1) leader goes down and a new Nimbus (2) picks up the leadership.
      2. If the blobs on Nimbus (2) have a different modTime, the workers are restarted (even though the blob contents might be the same).
      3. Nimbus (1) comes back up, syncs the blobs during startup and, because it downloads the blobs again, updates their modTime.
      4. If the Nimbus (2) leader then goes down, all the workers are restarted again, because Nimbus (1) once more has a new modTime.
      5. This can repeat endlessly, as the modTime will always differ between Nimbus leaders.
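
      To make the loop above concrete, here is a minimal, self-contained sketch (not Storm code). The class name ModTimeVsDigest and both helper methods are hypothetical, and two temp files stand in for the blob copies held by Nimbus (1) and Nimbus (2). It assumes only that the copies hold identical bytes but were written at different times:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical demo class; not part of Storm.
class ModTimeVsDigest {

    // Version derived from the file's modification time (the behaviour described in this issue).
    static long modTimeVersion(Path blob) throws IOException {
        return Files.getLastModifiedTime(blob).toMillis();
    }

    // Version derived from the file's bytes: SHA-1 digest folded into an int hash.
    static long contentVersion(Path blob) throws IOException, NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-1").digest(Files.readAllBytes(blob));
        return Arrays.hashCode(digest);
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = "same topology jar".getBytes(StandardCharsets.UTF_8);

        // Two temp files stand in for the blob copies on Nimbus (1) and Nimbus (2).
        Path nimbus1 = Files.createTempFile("nimbus1-", ".blob");
        Path nimbus2 = Files.createTempFile("nimbus2-", ".blob");
        Files.write(nimbus1, payload);
        Files.write(nimbus2, payload);

        // Simulate Nimbus (2) having downloaded its copy one minute later.
        Files.setLastModifiedTime(nimbus2,
            FileTime.fromMillis(modTimeVersion(nimbus1) + 60_000));

        System.out.println("modTime versions equal: "
            + (modTimeVersion(nimbus1) == modTimeVersion(nimbus2)));
        System.out.println("content versions equal: "
            + (contentVersion(nimbus1) == contentVersion(nimbus2)));

        Files.delete(nimbus1);
        Files.delete(nimbus2);
    }
}
```

      Running it should report false for the modTime comparison and true for the content comparison, which is the mismatch behind the restarts in steps 2 and 4.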

      We suggest a new method that obtains the file version:

      public abstract class BlobStoreFile {
          public abstract long getModTime() throws IOException;
      
          public long getVersion() throws IOException {
              return getModTime();
          }
      }

      The method defaults to the current approach (modTime) when it is not overridden. An implementation could then compute the file version along these lines:

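      public long getVersion() throws IOException {
          // try-with-resources closes the stream once the digest has been computed
          try (FileInputStream in = new FileInputStream(path)) {
              byte[] bytes = DigestUtils.sha1(in);
              return Arrays.hashCode(bytes);
          }
      }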
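
      Below is a self-contained sketch of how the two snippets above could fit together. It is an illustration under assumptions, not the eventual patch: BlobStoreFileSketch and ContentVersionedFile are hypothetical names, and the JDK's MessageDigest is swapped in for DigestUtils so the example runs without commons-codec:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Simplified stand-in for BlobStoreFile; only the two methods from the proposal.
abstract class BlobStoreFileSketch {
    public abstract long getModTime() throws IOException;

    // Default: keep the current behaviour for implementations that do not override.
    public long getVersion() throws IOException {
        return getModTime();
    }
}

// Hypothetical local-file implementation that versions by content instead of modTime.
class ContentVersionedFile extends BlobStoreFileSketch {
    private final Path path;

    ContentVersionedFile(Path path) {
        this.path = path;
    }

    @Override
    public long getModTime() throws IOException {
        return Files.getLastModifiedTime(path).toMillis();
    }

    @Override
    public long getVersion() throws IOException {
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IOException("SHA-1 not available", e);
        }
        // Stream the file through the digest so large blobs are not held in memory.
        try (InputStream in = new DigestInputStream(Files.newInputStream(path), sha1)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // Each read feeds the bytes into the digest.
            }
        }
        return Arrays.hashCode(sha1.digest());
    }

    public static void main(String[] args) throws IOException {
        Path a = Files.createTempFile("blob-a-", ".jar");
        Path b = Files.createTempFile("blob-b-", ".jar");
        Files.write(a, new byte[] {1, 2, 3});
        Files.write(b, new byte[] {1, 2, 3});
        // Same bytes, so both copies report the same version regardless of modTime.
        System.out.println(new ContentVersionedFile(a).getVersion()
            == new ContentVersionedFile(b).getVersion());
        Files.delete(a);
        Files.delete(b);
    }
}
```

      Because Arrays.hashCode returns an int, a version computed this way carries only 32 bits of the digest. That matches the snippet above, but it may be worth revisiting if collisions are a concern.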

      Soon, I'll open the PR and link it here.

          People

            Assignee: Unassigned
            Reporter: Pedro Azevedo (paxadax)