Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- 0.19.0
- None
Description
Here is a backtrace:
======= Backtrace: =========
/lib64/libc.so.6[0x7f916acc274f]
/lib64/libc.so.6(cfree+0x4b)[0x7f916acc6a4b]
/usr/local/lib64/libmesos-0.19.0-tw6_rc1.so(_ZN7process17receiving_connectEP7ev_loopP5ev_ioi+0xc5)[0x7f9146a64d55]
/usr/local/lib64/libmesos-0.19.0-tw6_rc1.so(ev_invoke_pending+0x55)[0x7f9146b65105]
/usr/local/lib64/libmesos-0.19.0-tw6_rc1.so(ev_run+0x937)[0x7f9146b680b7]
/usr/local/lib64/libmesos-0.19.0-tw6_rc1.so(_ZN7process5serveEPv+0xb)[0x7f9146a4c1cb]
/lib64/libpthread.so.0[0x7f916b3c283d]
/lib64/libc.so.6(clone+0x6d)[0x7f916ad2626d]
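The frames inside libmesos are mangled C++ symbols. As a reading aid, here is a minimal sketch of demangling them with abi::__cxa_demangle from <cxxabi.h>, assuming a GCC or Clang toolchain. The helper name demangle is made up for this illustration and is not part of Mesos or libprocess.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang runtime)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI
// symbol, or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& symbol) {
  int status = 0;
  // With null buffer arguments, __cxa_demangle allocates the result
  // with malloc, so the caller has to free it.
  char* result =
      abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
  if (status != 0 || result == nullptr) {
    return symbol;
  }
  std::string demangled(result);
  std::free(result);
  return demangled;
}
```

Applied to the two mangled frames above, this yields process::receiving_connect(ev_loop*, ev_io*, int) and process::serve(void*), so the free() that aborts is called from receiving_connect.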
The bug was introduced when we added support for pure language bindings communicating with libprocess (see the XXX comments in the diff below):
@@ -1930,13 +1991,13 @@ void SocketManager::link(ProcessBase* process, const UPID& to)
       persists[node] = s;
-      // Allocate and initialize the decoder and watcher (we really
-      // only "receive" on this socket so that we can react when it
-      // gets closed and generate appropriate lost events).
-      DataDecoder* decoder = new DataDecoder(sockets[s]);
-
+      // Allocate and initialize a watcher for reading data from this
+      // socket. Note that we don't expect to receive anything other
+      // than HTTP '202 Accepted' responses which we anyway ignore.
+      // We do, however, want to react when it gets closed so we can
+      // generate appropriate lost events (since this is a 'link').
       ev_io* watcher = new ev_io();
-      watcher->data = decoder;
+      watcher->data = new Socket(sockets[s]); // XXX receiving_connect expects watcher->data to be a Decoder* !!!
       // Try and connect to the node using this socket.
       sockaddr_in addr;
       memset(&addr, 0, sizeof(addr));
       addr.sin_family = PF_INET;
       addr.sin_port = htons(to.port);
       addr.sin_addr.s_addr = to.ip;
       if (connect(s, (sockaddr*) &addr, sizeof(addr)) < 0) {
         if (errno != EINPROGRESS) {
           PLOG(FATAL) << "Failed to link, connect";
         }
         // Wait for socket to be connected.
         ev_io_init(watcher, receiving_connect, s, EV_WRITE); // XXX: watcher->data is a Socket*, not a Decoder*!
       } else {
         ev_io_init(watcher, ignore_data, s, EV_READ);
       }
receiving_connect, however, expects watcher->data to be a DataDecoder*:
void receiving_connect(struct ev_loop* loop, ev_io* watcher, int revents)
{
  int s = watcher->fd;

  // Now check that a successful connection was made.
  int opt;
  socklen_t optlen = sizeof(opt);

  if (getsockopt(s, SOL_SOCKET, SO_ERROR, &opt, &optlen) < 0 || opt != 0) {
    // Connect failure.
    VLOG(1) << "Socket error while connecting";
    socket_manager->close(s);
    DataDecoder* decoder = (DataDecoder*) watcher->data; // XXX A Socket* in the case above !!
    delete decoder;
    ev_io_stop(loop, watcher);
    delete watcher;
  } else {
    // We're connected! Now let's do some receiving.
    ev_io_stop(loop, watcher);
    ev_io_init(watcher, ignore_data, s, EV_READ);
    ev_io_start(loop, watcher);
  }
}
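The mismatch above comes from the untyped watcher->data slot: the code that stores the pointer and the callback that deletes it have to agree on the type by convention only. One way to make that agreement explicit is sketched below, under stated assumptions. Watcher, attach and release are hypothetical stand-ins written for this illustration (the real ev_io has no destroy member), the Socket and DataDecoder structs are simplified placeholders, and this is not necessarily how the issue was fixed in Mesos.

```cpp
// Simplified placeholders for the types named in the report; the
// counters exist only so the behaviour can be observed.
struct Socket {
  static int destroyed;
  int fd;
  explicit Socket(int fd_) : fd(fd_) {}
  ~Socket() { ++destroyed; }
};
int Socket::destroyed = 0;

struct DataDecoder {
  static int destroyed;
  int fd;
  explicit DataDecoder(int fd_) : fd(fd_) {}
  ~DataDecoder() { ++destroyed; }
};
int DataDecoder::destroyed = 0;

// Hypothetical stand-in for ev_io: an untyped payload plus a deleter
// that remembers the payload's real type.
struct Watcher {
  void* data = nullptr;
  void (*destroy)(void*) = nullptr;
};

// Stores the payload together with a deleter for its actual type T,
// so a later cleanup path cannot cast it to the wrong class.
template <typename T>
void attach(Watcher* watcher, T* payload) {
  watcher->data = payload;
  watcher->destroy = [](void* p) { delete static_cast<T*>(p); };
}

// Frees the payload through the recorded deleter and clears the slot.
void release(Watcher* watcher) {
  if (watcher->destroy != nullptr) {
    watcher->destroy(watcher->data);
  }
  watcher->data = nullptr;
  watcher->destroy = nullptr;
}
```

With this shape, a connect-failure path would call release(watcher) instead of casting watcher->data itself, and the same path would work whether a Socket or a DataDecoder had been attached.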
Issue Links
- is duplicated by MESOS-1430: SIGSEGV in ~DataDecoder on master branch (Resolved)