IMPALA-4135

Thrift threaded server times out waiting connections during high load

    Details

    • Docs Text:
      This JIRA fixes a TCP connection scalability issue for highly concurrent workloads on an Impala cluster. It improves the connection accept rate under stress by accepting connections as soon as possible and queueing them for further processing. Because accept() is no longer blocked, the requesting client does not time out. The change adds a new startup parameter, --accepted_cnxn_queue_depth, which controls the accept queue depth and defaults to 10000. Based on our stress testing on a 250-node cluster, clusters of that scale should increase this to ~100000.
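As a minimal illustration of the Docs Text above (the flag name and suggested values are from this JIRA; how flags are passed to the daemon varies by deployment and is shown here only schematically):

```shell
# Illustrative invocation only: pass the new flag when starting the Impala
# daemon. --accepted_cnxn_queue_depth defaults to 10000; ~100000 is the
# suggestion for clusters around 250 nodes.
impalad --accepted_cnxn_queue_depth=100000
```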

      Description

      During times of high load, Thrift's TThreadedServer can't keep up with the rate of new socket connections.

      Here's a repro:

      ThreadPool<int64_t> pool("group", "test", 128, 10000,
          [](int tid, const int64_t& item) {
            using Client = ThriftClient<ImpalaInternalServiceClient>;
            // Note: the client is never freed, so each connection stays open
            // for the duration of the test.
            Client* client = new Client("127.0.0.1", 22000, "", NULL, false);
            Status status = client->Open();
            if (status.ok()) {
              LOG(INFO) << "Socket " << item << " -> OK";
            } else {
              LOG(INFO) << "Socket " << item << " -> Failed: " << status.GetDetail();
            }
          });
      // Open 16k connections concurrently (128 threads, queue depth 10000).
      for (int i = 0; i < 1024 * 16; ++i) pool.Offer(i);
      pool.DrainAndShutdown();

      Somewhere between 5 and 50 connections fail on my machine with "connect(): timed out" error messages. This happens when a socket sits in the server's accept queue for too long.

      The server runs accept() in a single thread, and then does all the work of starting the server-side handler in that same thread: creating a new thread, taking a lock, creating transports, and so on.

      The important thing is to move sockets from the waiting-for-accept state to the accepted state. It's fine if there is some delay between a connection being accepted and being completely set up. So the easiest fix is to add a small thread pool to the TThreadedServer which handles every aspect of connection set-up except for accept() itself, leaving the main server thread to do accept() and Offer() and nothing else. Even if the thread pool has only one thread, there's a benefit as long as the thread pool's queue is large enough to buffer spikes in connection requests.

      With a prototype in place, and one thread in the thread pool, it took ~35s to accept 16k connections, a rate of roughly 470 accepts per second. Without the patch, that rate drops to ~60 per second. I'm not sure what limits the thread-pool solution's throughput: maybe the queue gets full, or maybe the test driver that's opening the connections has a limit.

      This will be fixed longer term by IMPALA-2567, but in the meantime we can reduce the pain this causes for larger clusters running more complex queries.


              People

              • Assignee:
                Thomas Tauber-Marshall (twmarshall)
              • Reporter:
                Henry Robinson (henryr)
              • Votes:
                2
              • Watchers:
                17
