Mesos / MESOS-653

Slave gets wrong id and commits suicide

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.14.0
    • Fix Version/s: None
    • Component/s: agent
    • Labels:
    • Environment:

      Debian 6, mesos git rev: 853c9ba

      Description

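      After a restart, the slave cannot re-register with the master and terminates itself. The log below shows it registering, then receiving a different slave ID and exiting.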

      I0821 11:48:08.825433 24723 slave.cpp:113] Slave started on 1)@127.0.0.1:5051
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_INFO@log_env@658: Client environment:zookeeper.version=zookeeper C client 3.3.4
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_INFO@log_env@662: Client environment:host.name=192.168.1.1
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_INFO@log_env@669: Client environment:os.name=Linux
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_INFO@log_env@670: Client environment:os.arch=2.6.32-5-amd64
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_INFO@log_env@671: Client environment:os.version=#1 SMP Fri May 10 08:43:19 UTC 2013
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_INFO@log_env@679: Client environment:user.name=(null)
      I0821 11:48:08.825728 24723 slave.cpp:213] Slave resources: cpus:4; mem:6866; disk:10288; ports:[31000-32000]
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_INFO@log_env@687: Client environment:user.home=/root
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_INFO@log_env@699: Client environment:user.dir=/
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_INFO@zookeeper_init@727: Initiating client connection, host=192.168.1.1:2181 sessionTimeout=10000 watcher=0x7f18e0086aa0 sessionId=0 sessionPasswd=<null> context=0x7f18d4004d50 flags=0
      2013-08-21 11:48:08,825:24712(0x7f18da82a700):ZOO_DEBUG@start_threads@152: starting threads...
      2013-08-21 11:48:08,826:24712(0x7f18d3df4700):ZOO_DEBUG@do_io@279: started IO thread
      2013-08-21 11:48:08,826:24712(0x7f18d35f3700):ZOO_DEBUG@do_completion@326: started completion thread
      2013-08-21 11:48:08,826:24712(0x7f18d3df4700):ZOO_INFO@check_events@1585: initiated connection to server [192.168.1.1:2181]
      I0821 11:48:08.826833 24722 process_isolator.cpp:315] Recovering isolator
      I0821 11:48:08.826894 24724 slave.cpp:403] Finished recovery
      I0821 11:48:08.826939 24724 slave.cpp:423] Garbage collecting old slave 201308151322-2380865683-5050-29275-0
      I0821 11:48:08.826990 24723 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201308151322-2380865683-5050-29275-0' for removal
      I0821 11:48:08.827003 24724 slave.cpp:423] Garbage collecting old slave 201308191529-2380865683-5050-11189-0
      I0821 11:48:08.827046 24723 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201308191529-2380865683-5050-11189-0' for removal
      2013-08-21 11:48:08,838:24712(0x7f18d3df4700):ZOO_INFO@check_events@1632: session establishment complete on server [192.168.1.1:2181], sessionId=0x14076ce74240387, negotiated timeout=10000
      2013-08-21 11:48:08,838:24712(0x7f18d3df4700):ZOO_DEBUG@check_events@1638: Calling a watcher for a ZOO_SESSION_EVENT and the state=ZOO_CONNECTED_STATE
      2013-08-21 11:48:08,838:24712(0x7f18d35f3700):ZOO_DEBUG@process_completions@1765: Calling a watcher for node [], type = -1 event=ZOO_SESSION_EVENT
      I0821 11:48:08.838765 24723 detector.cpp:234] Master detector (slave(1)@127.0.0.1:5051) connected to ZooKeeper ...
      I0821 11:48:08.838783 24723 detector.cpp:251] Trying to create path '/mesos' in ZooKeeper
      2013-08-21 11:48:08,838:24712(0x7f18da029700):ZOO_DEBUG@zoo_awexists@2587: Sending request xid=0x52148cd9 for path [/mesos] to 192.168.1.1:2181
      2013-08-21 11:48:08,839:24712(0x7f18d3df4700):ZOO_DEBUG@zookeeper_process@1989: Queueing asynchronous response
      2013-08-21 11:48:08,839:24712(0x7f18d35f3700):ZOO_DEBUG@process_completions@1784: Calling COMPLETION_STAT for xid=0x52148cd9 rc=0
      2013-08-21 11:48:08,839:24712(0x7f18da029700):ZOO_DEBUG@zoo_awget_children_@2626: Sending request xid=0x52148cda for path [/mesos] to 192.168.1.1:2181
      2013-08-21 11:48:08,839:24712(0x7f18d3df4700):ZOO_DEBUG@zookeeper_process@1989: Queueing asynchronous response
      2013-08-21 11:48:08,839:24712(0x7f18d35f3700):ZOO_DEBUG@process_completions@1795: Calling COMPLETION_STRINGLIST for xid=0x52148cda rc=0
      I0821 11:48:08.839437 24723 detector.cpp:420] Master detector (slave(1)@127.0.0.1:5051) found 1 registered masters
      2013-08-21 11:48:08,839:24712(0x7f18da029700):ZOO_DEBUG@zoo_awget@2414: Sending request xid=0x52148cdb for path [/mesos/0000000006] to 192.168.1.1:2181
      2013-08-21 11:48:08,839:24712(0x7f18d3df4700):ZOO_DEBUG@zookeeper_process@1989: Queueing asynchronous response
      2013-08-21 11:48:08,839:24712(0x7f18d35f3700):ZOO_DEBUG@process_completions@1772: Calling COMPLETION_DATA for xid=0x52148cdb rc=0
      I0821 11:48:08.839607 24723 detector.cpp:467] Master detector (slave(1)@127.0.0.1:5051) got new master pid: master@192.168.1.1:5050
      I0821 11:48:08.839712 24724 slave.cpp:542] New master detected at master@192.168.1.1:5050
      I0821 11:48:08.839778 24723 status_update_manager.cpp:157] New master detected at master@192.168.1.1:5050
      I0821 11:48:08.840149 24721 slave.cpp:602] Registered with master master@192.168.1.1:5050; given slave ID 201308191529-2380865683-5050-11189-4621
      Registered but got wrong id: 201308191529-2380865683-5050-11189-4622(expected: 201308191529-2380865683-5050-11189-4621). Committing suicide
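
For triage, the mismatch can be pulled out of the log mechanically. The following is a minimal sketch, not part of Mesos: the function name `find_id_mismatch` and both regular expressions are assumptions written against the two final log lines above, and they may need adjusting for other Mesos versions.

```python
import re

# The two final lines of the slave log above, copied verbatim.
LOG_LINES = [
    "I0821 11:48:08.840149 24721 slave.cpp:602] Registered with master "
    "master@192.168.1.1:5050; given slave ID "
    "201308191529-2380865683-5050-11189-4621",
    "Registered but got wrong id: 201308191529-2380865683-5050-11189-4622"
    "(expected: 201308191529-2380865683-5050-11189-4621). Committing suicide",
]

# Hypothetical patterns, derived only from the message formats seen in this log.
GIVEN = re.compile(r"given slave ID (\S+)")
WRONG = re.compile(r"got wrong id: (\S+?)\(expected: (\S+?)\)")


def find_id_mismatch(lines):
    """Return (assigned, expected, received) for the first ID conflict, else None."""
    assigned = None
    for line in lines:
        match = GIVEN.search(line)
        if match:
            # Remember the ID the master handed out on registration.
            assigned = match.group(1)
            continue
        match = WRONG.search(line)
        if match:
            received, expected = match.group(1), match.group(2)
            return assigned, expected, received
    return None


if __name__ == "__main__":
    result = find_id_mismatch(LOG_LINES)
    if result is not None:
        assigned, expected, received = result
        print("assigned:", assigned)
        print("expected:", expected)
        print("received:", received)
```

In this log the assigned and expected IDs agree, and the received ID differs from them only in its trailing counter (4622 instead of 4621).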

            People

            • Assignee:
              Unassigned
              Reporter:
              deric Tomas Barton
            • Votes:
              0
              Watchers:
              2