Uploaded image for project: 'Mesos'
  1. Mesos
  2. MESOS-2050

Revise Authenticator interface

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 0.21.0
    • 0.23.0
    • None
    • None
    • Mesosphere Q1 Sprint 3 - 2/20, Mesosphere Q1 Sprint 4 - 3/6, Mesosphere Q1 Sprint 5 - 3/20, Mesosphere Q1 Sprint 6 - 4/3

    Description

      Observed this on ASF CI:

      Basically, as part of the recent Auth refactor for modules, the loading of secrets is being done once per Authenticator Process instead of once in the Master. Since, InMemoryAuxProp plugin manipulates static variables (e.g, 'properties') it results in SEGFAULT when one Authenticator (e.g., for slave) does load() while another Authenticator (e.g., for framework) does lookup(), as both these methods manipulate static 'properties'.

      [ RUN      ] MasterTest.LaunchDuplicateOfferTest
      Using temporary directory '/tmp/MasterTest_LaunchDuplicateOfferTest_XEBbvp'
      I1104 03:37:55.523553 28363 leveldb.cpp:176] Opened db in 2.270387ms
      I1104 03:37:55.524250 28363 leveldb.cpp:183] Compacted db in 662527ns
      I1104 03:37:55.524276 28363 leveldb.cpp:198] Created db iterator in 4964ns
      I1104 03:37:55.524284 28363 leveldb.cpp:204] Seeked to beginning of db in 702ns
      I1104 03:37:55.524291 28363 leveldb.cpp:273] Iterated through 0 keys in the db in 450ns
      I1104 03:37:55.524333 28363 replica.cpp:741] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
      I1104 03:37:55.524852 28384 recover.cpp:437] Starting replica recovery
      I1104 03:37:55.525188 28384 recover.cpp:463] Replica is in EMPTY status
      I1104 03:37:55.526577 28378 replica.cpp:638] Replica in EMPTY status received a broadcasted recover request
      I1104 03:37:55.527135 28378 master.cpp:318] Master 20141104-033755-3176252227-49988-28363 (proserpina.apache.org) started on 67.195.81.189:49988
      I1104 03:37:55.527180 28378 master.cpp:364] Master only allowing authenticated frameworks to register
      I1104 03:37:55.527191 28378 master.cpp:369] Master only allowing authenticated slaves to register
      I1104 03:37:55.527217 28378 credentials.hpp:36] Loading credentials for authentication from '/tmp/MasterTest_LaunchDuplicateOfferTest_XEBbvp/credentials'
      I1104 03:37:55.527451 28378 master.cpp:408] Authorization enabled
      I1104 03:37:55.528081 28384 master.cpp:126] No whitelist given. Advertising offers for all slaves
      I1104 03:37:55.528548 28383 recover.cpp:188] Received a recover response from a replica in EMPTY status
      I1104 03:37:55.528645 28388 hierarchical_allocator_process.hpp:299] Initializing hierarchical allocator process with master : master@67.195.81.189:49988
      I1104 03:37:55.529233 28388 master.cpp:1258] The newly elected leader is master@67.195.81.189:49988 with id 20141104-033755-3176252227-49988-28363
      I1104 03:37:55.529266 28388 master.cpp:1271] Elected as the leading master!
      I1104 03:37:55.529289 28388 master.cpp:1089] Recovering from registrar
      I1104 03:37:55.529311 28385 recover.cpp:554] Updating replica status to STARTING
      I1104 03:37:55.529500 28384 registrar.cpp:313] Recovering registrar
      I1104 03:37:55.530037 28383 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 497965ns
      I1104 03:37:55.530083 28383 replica.cpp:320] Persisted replica status to STARTING
      I1104 03:37:55.530335 28387 recover.cpp:463] Replica is in STARTING status
      I1104 03:37:55.531343 28381 replica.cpp:638] Replica in STARTING status received a broadcasted recover request
      I1104 03:37:55.531739 28384 recover.cpp:188] Received a recover response from a replica in STARTING status
      I1104 03:37:55.532168 28379 recover.cpp:554] Updating replica status to VOTING
      I1104 03:37:55.532572 28381 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 293974ns
      I1104 03:37:55.532594 28381 replica.cpp:320] Persisted replica status to VOTING
      I1104 03:37:55.532790 28390 recover.cpp:568] Successfully joined the Paxos group
      I1104 03:37:55.533107 28390 recover.cpp:452] Recover process terminated
      I1104 03:37:55.533604 28382 log.cpp:656] Attempting to start the writer
      I1104 03:37:55.534840 28381 replica.cpp:474] Replica received implicit promise request with proposal 1
      I1104 03:37:55.535188 28381 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 321021ns
      I1104 03:37:55.535212 28381 replica.cpp:342] Persisted promised to 1
      I1104 03:37:55.535893 28378 coordinator.cpp:230] Coordinator attemping to fill missing position
      I1104 03:37:55.537318 28392 replica.cpp:375] Replica received explicit promise request for position 0 with proposal 2
      I1104 03:37:55.537719 28392 leveldb.cpp:343] Persisting action (8 bytes) to leveldb took 366858ns
      I1104 03:37:55.537746 28392 replica.cpp:676] Persisted action at 0
      I1104 03:37:55.538872 28391 replica.cpp:508] Replica received write request for position 0
      I1104 03:37:55.538952 28391 leveldb.cpp:438] Reading position from leveldb took 36647ns
      I1104 03:37:55.539482 28391 leveldb.cpp:343] Persisting action (14 bytes) to leveldb took 483191ns
      I1104 03:37:55.539510 28391 replica.cpp:676] Persisted action at 0
      I1104 03:37:55.540045 28379 replica.cpp:655] Replica received learned notice for position 0
      I1104 03:37:55.540402 28379 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 299913ns
      I1104 03:37:55.540429 28379 replica.cpp:676] Persisted action at 0
      I1104 03:37:55.540455 28379 replica.cpp:661] Replica learned NOP action at position 0
      I1104 03:37:55.541205 28385 log.cpp:672] Writer started with ending position 0
      I1104 03:37:55.542579 28378 leveldb.cpp:438] Reading position from leveldb took 33358ns
      I1104 03:37:55.545608 28377 registrar.cpp:346] Successfully fetched the registry (0B) in 16.064768ms
      I1104 03:37:55.545778 28377 registrar.cpp:445] Applied 1 operations in 31511ns; attempting to update the 'registry'
      I1104 03:37:55.548598 28382 log.cpp:680] Attempting to append 143 bytes to the log
      I1104 03:37:55.548821 28385 coordinator.cpp:340] Coordinator attempting to write APPEND action at position 1
      I1104 03:37:55.549759 28385 replica.cpp:508] Replica received write request for position 1
      I1104 03:37:55.550269 28385 leveldb.cpp:343] Persisting action (162 bytes) to leveldb took 475014ns
      I1104 03:37:55.550300 28385 replica.cpp:676] Persisted action at 1
      I1104 03:37:55.550984 28382 replica.cpp:655] Replica received learned notice for position 1
      I1104 03:37:55.551311 28382 leveldb.cpp:343] Persisting action (164 bytes) to leveldb took 304432ns
      I1104 03:37:55.551332 28382 replica.cpp:676] Persisted action at 1
      I1104 03:37:55.551349 28382 replica.cpp:661] Replica learned APPEND action at position 1
      I1104 03:37:55.552196 28392 registrar.cpp:490] Successfully updated the 'registry' in 6.35904ms
      I1104 03:37:55.552338 28392 registrar.cpp:376] Successfully recovered registrar
      I1104 03:37:55.552423 28385 log.cpp:699] Attempting to truncate the log to 1
      I1104 03:37:55.552577 28382 master.cpp:1116] Recovered 0 slaves from the Registry (105B) ; allowing 10mins for slaves to re-register
      I1104 03:37:55.552647 28378 coordinator.cpp:340] Coordinator attempting to write TRUNCATE action at position 2
      I1104 03:37:55.553383 28391 replica.cpp:508] Replica received write request for position 2
      I1104 03:37:55.553772 28391 leveldb.cpp:343] Persisting action (16 bytes) to leveldb took 357114ns
      I1104 03:37:55.553797 28391 replica.cpp:676] Persisted action at 2
      I1104 03:37:55.554234 28379 replica.cpp:655] Replica received learned notice for position 2
      I1104 03:37:55.554529 28379 leveldb.cpp:343] Persisting action (18 bytes) to leveldb took 273827ns
      I1104 03:37:55.554565 28379 leveldb.cpp:401] Deleting ~1 keys from leveldb took 16455ns
      I1104 03:37:55.554580 28379 replica.cpp:676] Persisted action at 2
      I1104 03:37:55.554596 28379 replica.cpp:661] Replica learned TRUNCATE action at position 2
      I1104 03:37:55.566700 28385 slave.cpp:169] Slave started on 2)@67.195.81.189:49988
      I1104 03:37:55.566766 28385 credentials.hpp:84] Loading credential for authentication from '/tmp/MasterTest_LaunchDuplicateOfferTest_He08c1/credential'
      I1104 03:37:55.566946 28385 slave.cpp:276] Slave using credential for: test-principal
      I1104 03:37:55.567327 28385 slave.cpp:289] Slave resources: cpus(*):2; mem(*):1024; disk(*):3.70122e+06; ports(*):[31000-32000]
      I1104 03:37:55.567436 28385 slave.cpp:318] Slave hostname: proserpina.apache.org
      I1104 03:37:55.567458 28385 slave.cpp:319] Slave checkpoint: false
      W1104 03:37:55.567471 28385 slave.cpp:321] Disabling checkpointing is deprecated and the --checkpoint flag will be removed in a future release. Please avoid using this flag
      I1104 03:37:55.568488 28392 state.cpp:33] Recovering state from '/tmp/MasterTest_LaunchDuplicateOfferTest_He08c1/meta'
      I1104 03:37:55.568739 28380 status_update_manager.cpp:197] Recovering status update manager
      I1104 03:37:55.569128 28387 slave.cpp:3478] Finished recovery
      I1104 03:37:55.569718 28388 status_update_manager.cpp:171] Pausing sending status updates
      I1104 03:37:55.569773 28384 slave.cpp:602] New master detected at master@67.195.81.189:49988
      I1104 03:37:55.569820 28384 slave.cpp:665] Authenticating with master master@67.195.81.189:49988
      I1104 03:37:55.570000 28386 authenticatee.hpp:133] Creating new client SASL connection
      I1104 03:37:55.570137 28384 slave.cpp:638] Detecting new master
      I1104 03:37:55.570248 28385 master.cpp:3869] Authenticating slave(2)@67.195.81.189:49988
      I1104 03:37:55.570291 28385 master.cpp:3880] Using default CRAM-MD5 authenticator
      I1104 03:37:55.570592 28377 authenticator.hpp:169] Creating new server SASL connection
      I1104 03:37:55.570808 28377 authenticatee.hpp:224] Received SASL authentication mechanisms: CRAM-MD5
      I1104 03:37:55.570826 28363 sched.cpp:137] Version: 0.21.0
      I1104 03:37:55.570837 28377 authenticatee.hpp:250] Attempting to authenticate with mechanism 'CRAM-MD5'
      I1104 03:37:55.571281 28386 authenticator.hpp:275] Received SASL authentication start
      I1104 03:37:55.571321 28386 authenticator.hpp:397] Authentication requires more steps
      I1104 03:37:55.571599 28391 sched.cpp:233] New master detected at master@67.195.81.189:49988
      I1104 03:37:55.571635 28391 sched.cpp:283] Authenticating with master master@67.195.81.189:49988
      I1104 03:37:55.571771 28378 authenticatee.hpp:133] Creating new client SASL connection
      I1104 03:37:55.571974 28377 authenticatee.hpp:270] Received SASL authentication step
      I1104 03:37:55.572016 28380 master.cpp:3869] Authenticating scheduler-3dc2b4d5-b421-4e73-9088-32ed103e743d@67.195.81.189:49988
      I1104 03:37:55.572082 28377 authenticator.hpp:303] Received SASL authentication step
      I1104 03:37:55.572103 28380 master.cpp:3880] Using default CRAM-MD5 authenticator
      I1104 03:37:55.572108 28377 auxprop.cpp:99] Request to lookup properties for user: 'test-principal' realm: 'proserpina.apache.org' server FQDN: 'proserpina.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
      I1104 03:37:55.572119 28377 auxprop.cpp:171] Looking up auxiliary property '*userPassword'
      *** Aborted at 1415072275 (unix time) try "date -d @1415072275" if you are using GNU date ***
      I1104 03:37:55.572304 28387 authenticator.hpp:169] Creating new server SASL connection
      PC: @     0x2aae9cc945cb (unknown)
      I1104 03:37:55.572639 28380 authenticatee.hpp:224] Received SASL authentication mechanisms: CRAM-MD5
      I1104 03:37:55.572667 28380 authenticatee.hpp:250] Attempting to authenticate with mechanism 'CRAM-MD5'
      I1104 03:37:55.572756 28391 authenticator.hpp:275] Received SASL authentication start
      I1104 03:37:55.572796 28391 authenticator.hpp:397] Authentication requires more steps
      I1104 03:37:55.572860 28391 authenticatee.hpp:270] Received SASL authentication step
      I1104 03:37:55.572939 28387 authenticator.hpp:303] Received SASL authentication step
      I1104 03:37:55.572958 28387 auxprop.cpp:99] Request to lookup properties for user: 'test-principal' realm: 'proserpina.apache.org' server FQDN: 'proserpina.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false 
      I1104 03:37:55.572967 28387 auxprop.cpp:171] Looking up auxiliary property '*userPassword'
      I1104 03:37:55.572999 28387 auxprop.cpp:171] Looking up auxiliary property '*cmusaslsecretCRAM-MD5'
      I1104 03:37:55.573015 28387 auxprop.cpp:99] Request to lookup properties for user: 'test-principal' realm: 'proserpina.apache.org' server FQDN: 'proserpina.apache.org' SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true 
      I1104 03:37:55.573027 28387 auxprop.cpp:121] Skipping auxiliary property '*userPassword' since SASL_AUXPROP_AUTHZID == true
      I1104 03:37:55.573034 28387 auxprop.cpp:121] Skipping auxiliary property '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
      I1104 03:37:55.573045 28387 authenticator.hpp:389] Authentication success
      I1104 03:37:55.573132 28390 authenticatee.hpp:310] Authentication success
      I1104 03:37:55.573168 28388 master.cpp:3930] Successfully authenticated principal 'test-principal' at scheduler-3dc2b4d5-b421-4e73-9088-32ed103e743d@67.195.81.189:49988
      I1104 03:37:55.573325 28390 sched.cpp:357] Successfully authenticated with master master@67.195.81.189:49988
      I1104 03:37:55.573348 28390 sched.cpp:476] Sending registration request to master@67.195.81.189:49988
      I1104 03:37:55.573606 28388 master.cpp:1378] Received registration request for framework 'default' at scheduler-3dc2b4d5-b421-4e73-9088-32ed103e743d@67.195.81.189:49988
      I1104 03:37:55.573689 28388 master.cpp:1337] Authorizing framework principal 'test-principal' to receive offers for role '*'
      I1104 03:37:55.574188 28388 master.cpp:1442] Registering framework 20141104-033755-3176252227-49988-28363-0000 (default) at scheduler-3dc2b4d5-b421-4e73-9088-32ed103e743d@67.195.81.189:49988
      I1104 03:37:55.574515 28386 hierarchical_allocator_process.hpp:329] Added framework 20141104-033755-3176252227-49988-28363-0000
      I1104 03:37:55.574547 28386 hierarchical_allocator_process.hpp:697] No resources available to allocate!
      I1104 03:37:55.574560 28386 hierarchical_allocator_process.hpp:659] Performed allocation for 0 slaves in 19409ns
      I1104 03:37:55.574631 28388 sched.cpp:407] Framework registered with 20141104-033755-3176252227-49988-28363-0000
      I1104 03:37:55.574676 28388 sched.cpp:421] Scheduler::registered took 20198ns
      *** SIGSEGV (@0xfffffffffffffff8) received by PID 28363 (TID 0x2aae9f399700) from PID 18446744073709551608; stack trace: ***
          @     0x2aae9d409340 (unknown)
          @     0x2aae9cc945cb (unknown)
          @           0x98e4f0 _ZNSt10_List_nodeISsEC2IJRKSsEEEDpOT_
          @           0x9788c2 _ZN9__gnu_cxx13new_allocatorISt10_List_nodeISsEE9constructIS2_JRKSsEEEvPT_DpOT0_
          @           0x956696 std::list<>::_M_create_node<>()
          @           0x9376a4 std::list<>::_M_insert<>()
          @           0x978829 _ZNSt4listISsSaISsEE12emplace_backIJRKSsEEEvDpOT_
          @           0x956534 std::list<>::_M_initialize_dispatch<>()
          @           0x9374e1 std::list<>::list()
          @           0xb0f80f mesos::internal::cram_md5::Property::Property()
          @     0x2aae9a6779f8 std::_List_node<>::_List_node<>()
          @     0x2aae9a67664e _ZN9__gnu_cxx13new_allocatorISt10_List_nodeIN5mesos8internal8cram_md58PropertyEEE9constructIS6_JRKS5_EEEvPT_DpOT0_
          @     0x2aae9a674876 _ZNSt4listIN5mesos8internal8cram_md58PropertyESaIS3_EE14_M_create_nodeIJRKS3_EEEPSt10_List_nodeIS3_EDpOT_
          @     0x2aae9a6721c6 _ZNSt4listIN5mesos8internal8cram_md58PropertyESaIS3_EE9_M_insertIJRKS3_EEEvSt14_List_iteratorIS3_EDpOT_
          @     0x2aae9a670326 std::list<>::push_back()
          @     0x2aae9a66e704 Multimap<>::get()
          @     0x2aae9a66b561 mesos::internal::cram_md5::InMemoryAuxiliaryPropertyPlugin::lookup()
          @     0x2aae9a66aa3d mesos::internal::cram_md5::InMemoryAuxiliaryPropertyPlugin::lookup()
          @     0x2aae9c7b9e29 (unknown)
          @     0x2aae9c7babe5 (unknown)
          @     0x2aaea2b8252e (unknown)
          @     0x2aaea2b82929 (unknown)
          @     0x2aae9c7c6628 (unknown)
          @           0xb0c482 mesos::internal::cram_md5::CRAMMD5AuthenticatorProcess::step()
          @           0xb135e9 ProtobufProcess<>::_handler1<>()
          @           0xb25229 _ZNSt5_BindIFPFvPN5mesos8internal8cram_md527CRAMMD5AuthenticatorProcessEMS3_FvRKSsEMNS1_25AuthenticationStepMessageEKFS6_vERKN7process4UPIDES6_ES4_S8_SB_St12_PlaceholderILi1EESI_ILi2EEEE6__callIvJSF_S6_EJLm0ELm1ELm2ELm3ELm4EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
          @           0xb2078c _ZNSt5_BindIFPFvPN5mesos8internal8cram_md527CRAMMD5AuthenticatorProcessEMS3_FvRKSsEMNS1_25AuthenticationStepMessageEKFS6_vERKN7process4UPIDES6_ES4_S8_SB_St12_PlaceholderILi1EESI_ILi2EEEEclIJSF_S6_EvEET0_DpOT_
          @           0xb1c33b std::_Function_handler<>::_M_invoke()
          @           0xb2bcc5 std::function<>::operator()()
          @           0xb2ba8e ProtobufProcess<>::visit()
          @     0x2aae9a84120a process::MessageEvent::visit()
          @           0x904766 process::ProcessBase::serve()
      
      

      Attachments

        Activity

          People

            tillt Till Toenshoff
            vinodkone Vinod Kone
            Adam B Adam B
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: