  1. Mesos
  2. MESOS-1455

Segfault in libprocess during Process linking.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.19.0
    • Component/s: libprocess
    • Labels: None

    Description

      Here is a backtrace:

      ======= Backtrace: =========
      /lib64/libc.so.6[0x7f916acc274f]
      /lib64/libc.so.6(cfree+0x4b)[0x7f916acc6a4b]
      /usr/local/lib64/libmesos-0.19.0-tw6_rc1.so(_ZN7process17receiving_connectEP7ev_loopP5ev_ioi+0xc5)[0x7f9146a64d55]
      /usr/local/lib64/libmesos-0.19.0-tw6_rc1.so(ev_invoke_pending+0x55)[0x7f9146b65105]
      /usr/local/lib64/libmesos-0.19.0-tw6_rc1.so(ev_run+0x937)[0x7f9146b680b7]
      /usr/local/lib64/libmesos-0.19.0-tw6_rc1.so(_ZN7process5serveEPv+0xb)[0x7f9146a4c1cb]
      /lib64/libpthread.so.0[0x7f916b3c283d]
      /lib64/libc.so.6(clone+0x6d)[0x7f916ad2626d]
      

      The bug was introduced when we added support for pure language bindings communicating with libprocess:

      See the XXX comments in the diff below:
      @@ -1930,13 +1991,13 @@ void SocketManager::link(ProcessBase* process, const UPID& to)
      
             persists[node] = s;
      
      -      // Allocate and initialize the decoder and watcher (we really
      -      // only "receive" on this socket so that we can react when it
      -      // gets closed and generate appropriate lost events).
      -      DataDecoder* decoder = new DataDecoder(sockets[s]);
      -
      +      // Allocate and initialize a watcher for reading data from this
      +      // socket. Note that we don't expect to receive anything other
      +      // than HTTP '202 Accepted' responses which we anyway ignore.
      +      // We do, however, want to react when it gets closed so we can
      +      // generate appropriate lost events (since this is a 'link').
             ev_io* watcher = new ev_io();
      -      watcher->data = decoder;
      +      watcher->data = new Socket(sockets[s]); // XXX receiving_connect expects watcher->data to be a Decoder* !!!
      
            // Try and connect to the node using this socket.
            sockaddr_in addr;
            memset(&addr, 0, sizeof(addr));
            addr.sin_family = PF_INET;
            addr.sin_port = htons(to.port);
            addr.sin_addr.s_addr = to.ip;
      
            if (connect(s, (sockaddr*) &addr, sizeof(addr)) < 0) {
              if (errno != EINPROGRESS) {
                PLOG(FATAL) << "Failed to link, connect";
              }
      
              // Wait for socket to be connected.
              ev_io_init(watcher, receiving_connect, s, EV_WRITE); // XXX: watcher->data is a Socket*, not a Decoder*!
            } else {
              ev_io_init(watcher, ignore_data, s, EV_READ);
            }
      
      receiving_connect expects watcher->data to be a DataDecoder*:
      void receiving_connect(struct ev_loop* loop, ev_io* watcher, int revents)
      {
        int s = watcher->fd;
      
        // Now check that a successful connection was made.
        int opt;
        socklen_t optlen = sizeof(opt);
      
        if (getsockopt(s, SOL_SOCKET, SO_ERROR, &opt, &optlen) < 0 || opt != 0) {
          // Connect failure.
          VLOG(1) << "Socket error while connecting";
          socket_manager->close(s);
          DataDecoder* decoder = (DataDecoder*) watcher->data; // XXX A Socket* in the case above !!
          delete decoder;
          ev_io_stop(loop, watcher);
          delete watcher;
        } else {
          // We're connected! Now let's do some receiving.
          ev_io_stop(loop, watcher);
          ev_io_init(watcher, ignore_data, s, EV_READ);
          ev_io_start(loop, watcher);
        }
      }
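
The mismatch above comes from storing one type in the untyped `watcher->data` field and deleting it as another. As a rough illustration only (not the fix that was committed), the sketch below ties the stored payload type to the callback that frees it. `FakeSocket`, `FakeDecoder`, `FakeWatcher`, `on_connect`, `make_watcher`, and `simulate_failed_link` are all hypothetical stand-ins introduced for this example; none of them are libprocess or libev names.

```cpp
// Hypothetical stand-ins for libprocess's Socket and DataDecoder. They only
// count destructor calls so the cleanup path can be observed.
struct FakeSocket {
  static int destroyed;
  ~FakeSocket() { ++destroyed; }
};
int FakeSocket::destroyed = 0;

struct FakeDecoder {
  static int destroyed;
  ~FakeDecoder() { ++destroyed; }
};
int FakeDecoder::destroyed = 0;

// Hypothetical stand-in for libev's ev_io: an untyped payload and a callback.
struct FakeWatcher {
  void* data;
  void (*cb)(FakeWatcher* watcher, bool connected);
};

// Connect callback parameterized on the payload type, so the cast and the
// delete always match what was stored in watcher->data.
template <typename T>
void on_connect(FakeWatcher* watcher, bool connected)
{
  if (!connected) {
    // Connect failure: free the payload with its real type, then the watcher.
    delete static_cast<T*>(watcher->data);
    delete watcher;
  }
  // On success the watcher would be re-armed for reading (omitted here).
}

// The only place that stores the payload also selects the matching callback,
// so a Socket* can never be handed to a callback that deletes a DataDecoder*.
template <typename T>
FakeWatcher* make_watcher(T* payload)
{
  FakeWatcher* watcher = new FakeWatcher();
  watcher->data = payload;
  watcher->cb = &on_connect<T>;
  return watcher;
}

// Simulates a link whose connect fails. Returns true if exactly one
// FakeSocket and no FakeDecoder was destroyed.
bool simulate_failed_link()
{
  FakeSocket::destroyed = 0;
  FakeDecoder::destroyed = 0;

  FakeWatcher* watcher = make_watcher(new FakeSocket());
  watcher->cb(watcher, false);  // Frees both the payload and the watcher.

  return FakeSocket::destroyed == 1 && FakeDecoder::destroyed == 0;
}
```

Calling `simulate_failed_link()` exercises the failure path. With the payload type carried by the callback, the `FakeSocket` is freed as a `FakeSocket`, which is the property the original `receiving_connect` lost once `link` started storing a `Socket*`.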
      

            People

              Assignee: bmahler Benjamin Mahler
              Reporter: bmahler Benjamin Mahler