MESOS-2043

Framework authentication fails with a timeout error and the framework never gets authenticated

Details

    • Mesosphere Sprint 35, Mesosphere Sprint 36, Mesosphere Sprint 37, Mesosphere Sprint 38
    • 5

    Description

      I'm facing this issue in master as of https://github.com/apache/mesos/commit/74ea59e144d131814c66972fb0cc14784d3503d4

      As adam-mesos mentioned in IRC, this sounds similar to MESOS-1866. I'm running 1 master and 1 scheduler (Aurora). Framework authentication fails due to a timeout:

      Error on the Mesos master:

      I1104 19:37:17.741449  8329 master.cpp:3874] Authenticating scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083
      I1104 19:37:17.741585  8329 master.cpp:3885] Using default CRAM-MD5 authenticator
      I1104 19:37:17.742106  8336 authenticator.hpp:169] Creating new server SASL connection
      W1104 19:37:22.742959  8329 master.cpp:3953] Authentication timed out
      W1104 19:37:22.743548  8329 master.cpp:3930] Failed to authenticate scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083: Authentication discarded
      

      Error on the scheduler:

      I1104 19:38:57.885486 49012 sched.cpp:283] Authenticating with master master@MASTER_IP:PORT
      I1104 19:38:57.885928 49002 authenticatee.hpp:133] Creating new client SASL connection
      I1104 19:38:57.890581 49007 authenticatee.hpp:224] Received SASL authentication mechanisms: CRAM-MD5
      I1104 19:38:57.890656 49007 authenticatee.hpp:250] Attempting to authenticate with mechanism 'CRAM-MD5'
      W1104 19:39:02.891196 49005 sched.cpp:378] Authentication timed out
      I1104 19:39:02.891850 49018 sched.cpp:338] Failed to authenticate with master master@MASTER_IP:PORT: Authentication discarded
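
      In both excerpts the "Authentication timed out" warning appears about five seconds after authentication starts. The following is a minimal sketch for measuring that gap from a log, assuming the usual glog line layout (severity letter, MMDD, time with microseconds, thread id, file:line). The function names are hypothetical, and the year 2014 is an assumption taken from the attachment file names, since glog lines carry no year.

```python
import re
from datetime import datetime

# Assumed glog layout: "I1104 19:37:17.741449  8329 master.cpp:3874] message"
GLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)


def parse_glog(line, year=2014):
    """Return (timestamp, severity, message) for a glog line, or None."""
    m = GLOG_RE.match(line.strip())
    if m is None:
        return None
    # The year is not in the log line; it is supplied by the caller.
    ts = datetime.strptime(f"{year}{m['mmdd']} {m['time']}", "%Y%m%d %H:%M:%S.%f")
    return ts, m["sev"], m["msg"]


def seconds_until_timeout(lines, start_marker="Authenticating"):
    """Seconds between the first line whose message starts with start_marker
    and the next 'Authentication timed out' line, or None if not found."""
    start = None
    for line in lines:
        parsed = parse_glog(line)
        if parsed is None:
            continue  # skip blank or non-glog lines
        ts, _, msg = parsed
        if start is None and msg.startswith(start_marker):
            start = ts
        elif start is not None and msg.startswith("Authentication timed out"):
            return (ts - start).total_seconds()
    return None
```

      Feeding the master or scheduler lines above to `seconds_until_timeout` gives the elapsed time for that attempt, which makes it easy to compare attempts across the larger attached logs.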
      

      It looks like two instances of the same framework, scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94 and scheduler-d2d4437b-d375-4467-a583-362152fe065a, are trying to authenticate, and both fail.

      W1104 19:36:30.769420  8319 master.cpp:3930] Failed to authenticate scheduler-20f88a53-5945-4977-b5af-28f6c52d3c94@SCHEDULER_IP:8083: Failed to communicate with authenticatee
      I1104 19:36:42.701441  8328 master.cpp:3860] Queuing up authentication request from scheduler-d2d4437b-d375-4467-a583-362152fe065a@SCHEDULER_IP:8083 because authentication is still in progress
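
      To check how many distinct scheduler instances show up in a master log, the lines can be grouped by scheduler PID. This is a rough sketch, assuming PIDs always have the form `scheduler-<UUID>@host:port` as in the lines above; `events_by_scheduler` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumed PID shape: "scheduler-" followed by a 36-character UUID, then "@".
PID_RE = re.compile(r"(scheduler-[0-9a-f-]{36})@")


def events_by_scheduler(lines):
    """Map each scheduler PID prefix to the log messages that mention it."""
    events = defaultdict(list)
    for line in lines:
        m = PID_RE.search(line)
        if m:
            # Keep only the message after the glog "file:line] " prefix.
            events[m.group(1)].append(line.split("] ", 1)[-1].strip())
    return dict(events)
```

      More than one key in the result for a single framework would be consistent with the two-instance observation above.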
      

      Restarting the master and the scheduler didn't fix it.

      This particular issue occurs with 1 master and 1 scheduler after the fix for MESOS-1866.

      Attachments

        1. slave.log
          12 kB
          Kevin Cox
        2. master.log
          3 kB
          Kevin Cox
        3. aurora-scheduler.20141104-1606-1706.log
          424 kB
          Bhuvaneswaran A
        4. mesos-master.20141104-1606-1706.log
          384 kB
          Bhuvaneswaran A

          People

            bbannier Benjamin Bannier
            bhuvan Bhuvaneswaran A
            Adam B Adam B
            Votes: 1
            Watchers: 10
