  1. Hadoop Distributed Data Store
  2. HDDS-1770

SCM crashes when ReplicationManager tries to re-replicate under-replicated containers


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SCM
    • Labels:
    • Target Version/s:
    • Sprint: HDDS Biscayne

      Description

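      SCM crashes with the following exception when ReplicationManager tries to re-replicate under-replicated containers. The IllegalArgumentException is raised in the Replication Monitor thread, and SCM then exits with status 1: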

      2019-07-08 12:46:36 ERROR ReplicationManager:215 - Exception in Replication Monitor Thread.
      java.lang.IllegalArgumentException: Affinity node /default-rack/aab15e2d07cc is not a member of topology
      at org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.checkAffinityNode(NetworkTopologyImpl.java:767)
      at org.apache.hadoop.hdds.scm.net.NetworkTopologyImpl.chooseRandom(NetworkTopologyImpl.java:407)
      at org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseNode(SCMContainerPlacementRackAware.java:242)
      at org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseDatanodes(SCMContainerPlacementRackAware.java:168)
      at org.apache.hadoop.hdds.scm.container.ReplicationManager.handleUnderReplicatedContainer(ReplicationManager.java:487)
      at org.apache.hadoop.hdds.scm.container.ReplicationManager.processContainer(ReplicationManager.java:293)
      at java.base/java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4698)
      at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1083)
      at org.apache.hadoop.hdds.scm.container.ReplicationManager.run(ReplicationManager.java:205)
      at java.base/java.lang.Thread.run(Thread.java:834)
      2019-07-08 12:46:36 INFO  ExitUtil:210 - Exiting with status 1: java.lang.IllegalArgumentException: Affinity node /default-rack/aab15e2d07cc is not a member of topology
      2019-07-08 12:46:36 INFO  StorageContainerManagerStarter:51 - SHUTDOWN_MSG: 
      /************************************************************
      SHUTDOWN_MSG: Shutting down StorageContainerManager at 8c763563f672/192.168.112.2
      ************************************************************/
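
The failure mode in the trace above can be illustrated with a minimal Java sketch. This is not the actual NetworkTopologyImpl or ReplicationManager code, and it is not the fix applied upstream: `TopologySketch`, `chooseNear`, and `chooseNearSafely` are hypothetical names introduced here for illustration. The sketch assumes that the affinity node (a datanode that holds an existing replica) is no longer registered in the topology when placement runs, so the membership check throws. It also shows one possible defensive pattern, where the caller skips the container instead of letting the exception escape the monitor thread.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for a network topology; NOT the real NetworkTopologyImpl.
final class TopologySketch {
    // Full network paths of the nodes currently registered, e.g. "/default-rack/dn1".
    private final Set<String> members = new LinkedHashSet<>();

    void add(String path) {
        members.add(path);
    }

    void remove(String path) {
        members.remove(path);
    }

    // Picks a node other than the affinity node. Like the check in the log,
    // it rejects an affinity node that is not registered in the topology.
    String chooseNear(String affinityPath) {
        if (!members.contains(affinityPath)) {
            throw new IllegalArgumentException(
                "Affinity node " + affinityPath + " is not a member of topology");
        }
        for (String member : members) {
            if (!member.equals(affinityPath)) {
                return member;
            }
        }
        // No other node is available; fall back to the affinity node itself.
        return affinityPath;
    }

    // Assumed defensive variant: report "no placement" for a stale affinity
    // node so the caller can skip this container and retry on a later pass.
    static Optional<String> chooseNearSafely(TopologySketch topology, String affinityPath) {
        try {
            return Optional.of(topology.chooseNear(affinityPath));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        TopologySketch topology = new TopologySketch();
        topology.add("/default-rack/dn1");
        topology.add("/default-rack/dn2");

        // The replica holder leaves the topology before placement runs.
        topology.remove("/default-rack/dn1");

        // Unguarded call: throws, as in the stack trace above.
        try {
            topology.chooseNear("/default-rack/dn1");
        } catch (IllegalArgumentException e) {
            System.out.println("Unguarded: " + e.getMessage());
        }

        // Guarded call: the container is skipped instead of crashing the thread.
        System.out.println("Guarded placement present: "
            + chooseNearSafely(topology, "/default-rack/dn1").isPresent());
    }
}
```

Whether skipping the container, falling back to a placement without an affinity node, or repairing the stale replica record is the right behavior depends on the placement policy; the sketch only shows why an unchecked exception from the membership check is enough to stop the monitor thread.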
      


              People

              • Assignee: Unassigned
              • Reporter: Nanda kumar
              • Votes: 0
              • Watchers: 3
