KUDU-2998: RebalancingDuringElectionStormTest.RoundRobin sometimes crashes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.10.0, 1.10.1
    • Fix Version/s: n/a
    • Component/s: test
    • Labels:
      None

      Description

      I saw the RebalancingDuringElectionStormTest.RoundRobin test crash in the DEBUG configuration with the following error:

      F1116 06:53:57.325479 11078 quorum_util.cc:167] Check failed: RaftPeerPB::NON_PARTICIPANT != GetConsensusRole(peer_uuid, cstate) (3 vs. 3) Peer fe4321fd981c466d86cd1fe2949868dc << not a participant in current_term: 25 leader_uuid: "422db10def4d4c95a5a5bfd2cb787aa2" committed_config { opid_index: 77 OBSOLETE_local: false peers { permanent_uuid: "f6d99d2a6f5542428e5e797972a0f53e" member_type: VOTER last_known_addr { host: "127.25.232.67" port: 41397 } } peers { permanent_uuid: "4084fddb6afb4aed80b27fc4bee3de1f" member_type: VOTER last_known_addr { host: "127.25.232.68" port: 39941 } } peers { permanent_uuid: "fe4321fd981c466d86cd1fe2949868dc" member_type: VOTER last_known_addr { host: "127.25.232.65" port: 40533 } attrs { replace: true } } peers { permanent_uuid: "422db10def4d4c95a5a5bfd2cb787aa2" member_type: VOTER last_known_addr { host: "127.25.232.70" port: 35983 } attrs { promote: false } } } pending_config { opid_index: 80 OBSOLETE_local: false peers { permanent_uuid: "f6d99d2a6f5542428e5e797972a0f53e" member_type: VOTER last_known_addr { host: "127.25.232.67" port: 41397 } } peers { permanent_uuid: "4084fddb6afb4aed80b27fc4bee3de1f" member_type: VOTER last_known_addr { host: "127.25.232.68" port: 39941 } } peers { permanent_uuid: "422db10def4d4c95a5a5bfd2cb787aa2" member_type: VOTER last_known_addr { host: "127.25.232.70" port: 35983 } attrs { promote: false } } }
      

      The stack trace looked like the following:

          @     0x7f6598afa62d  google::LogMessage::Fail() at ??:0
          @     0x7f6598afc64c  google::LogMessage::SendToLog() at ??:0
          @     0x7f6598afa189  google::LogMessage::Flush() at ??:0
          @     0x7f6598afcfdf  google::LogMessageFatal::~LogMessageFatal() at ??:0
          @     0x7f6599a2c12c  kudu::consensus::GetParticipantRole() at ??:0
          @     0x7f659a549cae  kudu::master::CatalogManager::BuildLocationsForTablet() at ??:0
          @     0x7f6596765d8b  _ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZNS6_6master15MasterServiceIfC1ERK13scoped_refptrINS6_12MetricEntityEERKSD_INS7_13ResultTrackerEEEUlS4_S5_S9_E18_E9_M_invokeERKSt9_Any_dataS4_S5_S9_ at ??:0
          @     0x7f659a54a37b  kudu::master::CatalogManager::GetTabletLocations() at ??:0
          @     0x7f659a5de5ea  kudu::master::MasterServiceImpl::GetTabletLocations() at ??:0
          @     0x7f659675e742  _ZZN4kudu6master15MasterServiceIfC1ERK13scoped_refptrINS_12MetricEntityEERKS2_INS_3rpc13ResultTrackerEEENKUlPKN6google8protobuf7MessageEPSE_PNS7_10RpcContextEE4_clESG_SH_SJ_ at ??:0
          @     0x7f6596764a9f  _ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZNS6_6master15MasterServiceIfC1ERK13scoped_refptrINS6_12MetricEntityEERKSD_INS7_13ResultTrackerEEEUlS4_S5_S9_E4_E9_M_invokeERKSt9_Any_dataS4_S5_S9_ at ??:0
          @     0x7f659472cd16  std::function<>::operator()() at ??:0
          @     0x7f659472c547  kudu::rpc::GeneratedServiceIf::Handle() at ??:0
          @     0x7f659472f02e  kudu::rpc::ServicePool::RunThread() at ??:0
          @     0x7f65947303fd  boost::_mfi::mf0<>::operator()() at ??:0
          @     0x7f6594730224  boost::_bi::list1<>::operator()<>() at ??:0
          @     0x7f659473010b  boost::_bi::bind_t<>::operator()() at ??:0
          @     0x7f659473003a  boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
          @     0x7f6599599842  boost::function0<>::operator()() at ??:0
          @     0x7f65995965cb  kudu::Thread::SuperviseThread() at ??:0
          @     0x7f6595ca2184  start_thread at ??:0
          @     0x7f6598104ffd  clone at ??:0
      

      The full log is attached.
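
One way to read the failed check, sketched under assumptions: in the consensus state dumped above, peer fe4321fd981c466d86cd1fe2949868dc appears in committed_config (opid_index 77, marked replace: true) but not in pending_config (opid_index 80). If the role lookup resolves against the pending config whenever one exists, a peer enumerated from the committed config would come back as NON_PARTICIPANT, which would match the "(3 vs. 3)" in the message. The types and functions below (Config, ConsensusState, GetRole, PeersFailingCheck, StateFromLog) are hypothetical, simplified stand-ins and not Kudu's actual classes; only the UUIDs and opid indexes are taken from the log.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-ins for the protobuf messages named in the log.
enum class Role { LEADER, FOLLOWER, NON_PARTICIPANT };

struct Config {
  long opid_index;
  std::vector<std::string> voter_uuids;
};

struct ConsensusState {
  std::string leader_uuid;
  Config committed;
  bool has_pending;
  Config pending;
};

// ASSUMPTION: the role is resolved against the pending config when one
// exists, and against the committed config otherwise.
Role GetRole(const std::string& uuid, const ConsensusState& cs) {
  const Config& active = cs.has_pending ? cs.pending : cs.committed;
  const std::vector<std::string>& v = active.voter_uuids;
  if (std::find(v.begin(), v.end(), uuid) == v.end()) {
    return Role::NON_PARTICIPANT;
  }
  return uuid == cs.leader_uuid ? Role::LEADER : Role::FOLLOWER;
}

// Walks the committed config and collects every peer that is not a
// participant in the active config, i.e. every peer that would trip a
// "must be a participant" check.
std::vector<std::string> PeersFailingCheck(const ConsensusState& cs) {
  std::vector<std::string> failing;
  for (const std::string& uuid : cs.committed.voter_uuids) {
    if (GetRole(uuid, cs) == Role::NON_PARTICIPANT) {
      failing.push_back(uuid);
    }
  }
  return failing;
}

// Rebuilds the state from the fatal log line (UUIDs and opid indexes only).
ConsensusState StateFromLog() {
  ConsensusState cs;
  cs.leader_uuid = "422db10def4d4c95a5a5bfd2cb787aa2";
  cs.committed = Config{77, {"f6d99d2a6f5542428e5e797972a0f53e",
                             "4084fddb6afb4aed80b27fc4bee3de1f",
                             "fe4321fd981c466d86cd1fe2949868dc",
                             "422db10def4d4c95a5a5bfd2cb787aa2"}};
  cs.has_pending = true;
  cs.pending = Config{80, {"f6d99d2a6f5542428e5e797972a0f53e",
                           "4084fddb6afb4aed80b27fc4bee3de1f",
                           "422db10def4d4c95a5a5bfd2cb787aa2"}};
  return cs;
}
```

Under that assumption, PeersFailingCheck(StateFromLog()) yields exactly the one UUID named in the fatal message, the replica being evicted by the pending config change. Whether this is the actual root cause is not established here; the issue was resolved as a duplicate.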

        Attachments

        1. rebalancer_tool-test.6.txt.xz
          232 kB
          Alexey Serbin

              People

              • Assignee:
                Unassigned
                Reporter:
                aserbin Alexey Serbin