Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 1.8.0
- Fix Version/s: None
- Environment: debian-9, centos-6, ubuntu-16.04, ..., macOS
Description
Easily reproduced on macOS for me, and also observed on the ASF CI:
$ ./bin/mesos-tests.sh --gtest_filter="*SchedulerTest.SchedulerFailover*" --gtest_repeat=100 --gtest_break_on_failure --verbose
[...]
Repeating all tests (iteration 61) . . .
[...]
[ RUN ] ContentType/SchedulerTest.SchedulerFailover/1
I0907 11:31:42.409766 311620992 cluster.cpp:173] Creating default 'local' authorizer
I0907 11:31:42.411957 110624768 master.cpp:413] Master 4450e893-595f-48c2-9ea2-31325fda2c76 (lobomacpro4.fritz.box) started on 192.168.178.20:54546
I0907 11:31:42.411975 110624768 master.cpp:416] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="hierarchical" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authentication_v0_timeout="15secs" --authenticators="crammd5" --authorizers="local" --credentials="/private/var/folders/66/mgr662nx7t90lspb7wjg8ctr0000gn/T/aVGDNy/credentials" --filter_gpu_resources="true" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --memory_profiling="false" --min_allocatable_resources="cpus:0.01|mem:32" --port="5050" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --require_agent_domain="false" --role_sorter="drf" --root_submissions="true" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/private/var/folders/66/mgr662nx7t90lspb7wjg8ctr0000gn/T/aVGDNy/master" --zk_session_timeout="10secs"
I0907 11:31:42.412191 110624768 master.cpp:465] Master only allowing authenticated frameworks to register
I0907 11:31:42.412202 110624768 master.cpp:471] Master only allowing authenticated agents to register
I0907 11:31:42.412210 110624768 master.cpp:477] Master only allowing authenticated HTTP frameworks to register
I0907 11:31:42.412219 110624768 credentials.hpp:37] Loading credentials for authentication from '/private/var/folders/66/mgr662nx7t90lspb7wjg8ctr0000gn/T/aVGDNy/credentials'
I0907 11:31:42.412322 110624768 master.cpp:521] Using default 'crammd5' authenticator
I0907 11:31:42.412355 110624768 http.cpp:1037] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readonly'
I0907 11:31:42.412390 110624768 http.cpp:1037] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readwrite'
I0907 11:31:42.412417 110624768 http.cpp:1037] Creating default 'basic' HTTP authenticator for realm 'mesos-master-scheduler'
I0907 11:31:42.412439 110624768 master.cpp:602] Authorization enabled
I0907 11:31:42.413738 110624768 master.cpp:2083] Elected as the leading master!
I0907 11:31:42.413750 110624768 master.cpp:1638] Recovering from registrar
I0907 11:31:42.413913 109551616 registrar.cpp:383] Successfully fetched the registry (0B) in 128us
I0907 11:31:42.413962 109551616 registrar.cpp:487] Applied 1 operations in 19755ns; attempting to update the registry
I0907 11:31:42.414093 109551616 registrar.cpp:544] Successfully updated the registry in 107008ns
I0907 11:31:42.414126 109551616 registrar.cpp:416] Successfully recovered registrar
I0907 11:31:42.414232 110624768 master.cpp:1752] Recovered 0 agents from the registry (162B); allowing 10mins for agents to reregister
I0907 11:31:42.414614 311620992 scheduler.cpp:189] Version: 1.8.0
I0907 11:31:42.415856 113844224 scheduler.cpp:355] Using default 'basic' HTTP authenticatee
I0907 11:31:42.415974 112771072 scheduler.cpp:538] New master detected at master@192.168.178.20:54546
I0907 11:31:42.417650 113844224 http.cpp:1177] HTTP POST for /master/api/v1/scheduler from 192.168.178.20:55273
I0907 11:31:42.417768 113844224 master.cpp:2502] Received subscription request for HTTP framework 'default'
I0907 11:31:42.417788 113844224 master.cpp:2155] Authorizing framework principal 'test-principal' to receive offers for roles '{ * }'
I0907 11:31:42.417914 113844224 master.cpp:2637] Subscribing framework 'default' with checkpointing disabled and capabilities [ MULTI_ROLE, RESERVATION_REFINEMENT ]
I0907 11:31:42.418388 113844224 master.cpp:9883] Adding framework 4450e893-595f-48c2-9ea2-31325fda2c76-0000 (default) with roles { } suppressed
I0907 11:31:42.418522 110624768 hierarchical.cpp:306] Added framework 4450e893-595f-48c2-9ea2-31325fda2c76-0000
I0907 11:31:42.419454 311620992 scheduler.cpp:189] Version: 1.8.0
I0907 11:31:42.420704 110088192 scheduler.cpp:355] Using default 'basic' HTTP authenticatee
I0907 11:31:42.420807 111161344 scheduler.cpp:538] New master detected at master@192.168.178.20:54546
I0907 11:31:42.422297 113844224 http.cpp:1177] HTTP POST for /master/api/v1/scheduler from 192.168.178.20:55275
I0907 11:31:42.422423 113844224 master.cpp:2502] Received subscription request for HTTP framework 'default'
I0907 11:31:42.422446 113844224 master.cpp:2155] Authorizing framework principal 'test-principal' to receive offers for roles '{ * }'
I0907 11:31:42.422591 113844224 master.cpp:2637] Subscribing framework 'default' with checkpointing disabled and capabilities [ MULTI_ROLE, RESERVATION_REFINEMENT ]
I0907 11:31:42.422608 113844224 master.cpp:7760] Updating framework 4450e893-595f-48c2-9ea2-31325fda2c76-0000 (default) with roles { } suppressed
I0907 11:31:42.422904 111161344 master.cpp:1226] Ignoring disconnection for framework 4450e893-595f-48c2-9ea2-31325fda2c76-0000 (default) as it has already reconnected
I0907 11:31:42.423132 113844224 scheduler.cpp:512] Re-detecting master
I0907 11:31:42.423475 113844224 scheduler.cpp:538] New master detected at master@192.168.178.20:54546
../../src/tests/scheduler_tests.cpp:251: Failure
Failed to wait 15secs for error
*** Aborted at 1536312717 (unix time) try "date -d @1536312717" if you are using GNU date ***
PC: @ 0x10d891ded testing::UnitTest::AddTestPartResult()
*** SIGSEGV (@0x0) received by PID 16639 (TID 0x11292f580) stack trace: ***
    @ 0x7fff72af7b3d _sigtramp
    @ 0x1108a1a00 (unknown)
    @ 0x10d8915e7 testing::internal::AssertHelper::operator=()
    @ 0x10cf83948 mesos::internal::tests::SchedulerTest_SchedulerFailover_Test::TestBody()
    @ 0x10d904c4e testing::internal::HandleSehExceptionsInMethodIfSupported<>()
    @ 0x10d8a9a9b testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x10d8a99c6 testing::Test::Run()
    @ 0x10d8ab79d testing::TestInfo::Run()
    @ 0x10d8acddc testing::TestCase::Run()
    @ 0x10d8bd2cc testing::internal::UnitTestImpl::RunAllTests()
    @ 0x10d90779e testing::internal::HandleSehExceptionsInMethodIfSupported<>()
    @ 0x10d8bcceb testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x10d8bcbac testing::UnitTest::Run()
    @ 0x10c1f52f1 RUN_ALL_TESTS()
    @ 0x10c1f0c9c main
    @ 0x7fff7290e0a1 start
Segmentation fault: 11
Issue Links
- relates to MESOS-9215 SchedulerSubscribeAfterFailoverTimeout times out. (Open)