IMPALA-1972

Queries that take a long time to plan can cause webserver to block other queries


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 2.3.0, Impala 2.4.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend
    Description

      Summary

      Trying to get the details of a query through the debug web page while the query is planning will block other queries (and the UI itself), because query_exec_state_map_lock_ will be held for the duration of planning.

      Details

      While a query is planning, it holds onto its query exec state's lock:

      lock_guard<mutex> l(*(*exec_state)->lock());

      // register exec state as early as possible so that queries that
      // take a long time to plan show up, and to handle incoming status
      // reports before execution starts.
      RETURN_IF_ERROR(RegisterQuery(session_state, *exec_state));
      *registered_exec_state = true;

      // GetExecRequest() does planning
      RETURN_IF_ERROR((*exec_state)->UpdateQueryStatus(
          exec_env_->frontend()->GetExecRequest(query_ctx, &result)));
      

      Query details callback

      ImpalaServer::QuerySummaryCallback, which handles /query_plan, tries to get the same exec state's lock (see the true argument to GetQueryExecState()):

      shared_ptr<QueryExecState> exec_state = GetQueryExecState(query_id, true);
      

      GetQueryExecState() holds query_exec_state_map_lock_ while it waits to get the QueryExecState's lock:

      shared_ptr<ImpalaServer::QueryExecState> ImpalaServer::GetQueryExecState(
          const TUniqueId& query_id, bool lock) {
        lock_guard<mutex> l(query_exec_state_map_lock_);
        QueryExecStateMap::iterator i = query_exec_state_map_.find(query_id);
        if (i == query_exec_state_map_.end()) {
          return shared_ptr<QueryExecState>();
        } else {
          if (lock) i->second->lock()->lock();
          return i->second;
        }
      }
      

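      So the web handler thread keeps holding query_exec_state_map_lock_ while it waits for the planning query's lock. Until planning finishes, no other query can acquire query_exec_state_map_lock_, which it needs in order to execute.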

      What can we do?

      In the short term, maybe we can add a TryGetQueryExecState() that indicates when the query exists but its lock can't be taken.
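
A minimal sketch of what such a try-style lookup could look like, using only the standard library. Everything here is an assumption made for illustration: Registry, TryGet, TryGetResult and the stripped-down QueryExecState are hypothetical stand-ins, not Impala code, and the real signature would likely differ. The point is only that std::mutex::try_lock() returns immediately, so the map lock is never held while waiting on a per-query lock.

```cpp
#include <cassert>
#include <cstdint>
#include <future>
#include <map>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical, stripped-down stand-in for ImpalaServer::QueryExecState.
struct QueryExecState {
  std::mutex lock;
};

// Hypothetical result codes; not part of Impala.
enum class TryGetResult { NOT_FOUND, BUSY, LOCKED };

// Hypothetical stand-in for the map and lock owned by ImpalaServer.
class Registry {
 public:
  std::shared_ptr<QueryExecState> Register(int64_t id) {
    std::lock_guard<std::mutex> l(map_lock_);
    auto state = std::make_shared<QueryExecState>();
    map_[id] = state;
    return state;
  }

  // Never waits on the per-query lock while holding map_lock_.
  // On LOCKED, *out is set and the caller must unlock (*out)->lock.
  TryGetResult TryGet(int64_t id, std::shared_ptr<QueryExecState>* out) {
    std::lock_guard<std::mutex> l(map_lock_);
    auto it = map_.find(id);
    if (it == map_.end()) return TryGetResult::NOT_FOUND;
    // try_lock() returns right away instead of blocking.
    if (!it->second->lock.try_lock()) return TryGetResult::BUSY;
    *out = it->second;
    return TryGetResult::LOCKED;
  }

 private:
  std::mutex map_lock_;
  std::map<int64_t, std::shared_ptr<QueryExecState>> map_;
};

// Simulates a web handler probing a query whose lock is held by a
// "planner" thread, and returns what TryGet reports.
TryGetResult ProbeWhilePlanning() {
  Registry registry;
  std::shared_ptr<QueryExecState> state = registry.Register(1);
  std::promise<void> locked, done;
  std::future<void> locked_f = locked.get_future();
  std::future<void> done_f = done.get_future();
  std::thread planner([&] {
    std::lock_guard<std::mutex> l(state->lock);  // "planning" in progress
    locked.set_value();
    done_f.wait();  // hold the lock until the probe has run
  });
  locked_f.wait();
  std::shared_ptr<QueryExecState> out;
  TryGetResult result = registry.TryGet(1, &out);
  done.set_value();
  planner.join();
  if (result == TryGetResult::LOCKED) out->lock.unlock();
  return result;
}
```

With this shape, a handler such as the /query_plan callback could report that the query is still planning when it sees BUSY, rather than stalling the UI.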

      Or we might be able to let go of query_exec_state_map_lock_ as soon as we find the entry, and before taking its lock:

      shared_ptr<ImpalaServer::QueryExecState> ImpalaServer::GetQueryExecState(
          const TUniqueId& query_id, bool lock) {
        shared_ptr<QueryExecState> ret;
        {
          lock_guard<mutex> l(query_exec_state_map_lock_);
          QueryExecStateMap::iterator i = query_exec_state_map_.find(query_id);
          if (i == query_exec_state_map_.end()) {
            return shared_ptr<QueryExecState>();
          } else {
            ret = i->second;
          }
        } // give up query_exec_state_map_lock_
      
        if (lock) ret->lock()->lock();
        return ret;
      }
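
The proposal above can be exercised in isolation with a small standard-library sketch. The types below (Registry, the stripped-down QueryExecState, integer query ids) are hypothetical stand-ins for the Impala classes, and MapUsableDuringPlanning is an illustrative helper, so treat this as a sketch of the locking order only, not as the eventual fix.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical, stripped-down stand-in for ImpalaServer::QueryExecState.
struct QueryExecState {
  std::mutex lock;
};

// Hypothetical stand-in for the map and lock owned by ImpalaServer.
class Registry {
 public:
  std::shared_ptr<QueryExecState> Register(int64_t id) {
    std::lock_guard<std::mutex> l(map_lock_);
    auto state = std::make_shared<QueryExecState>();
    map_[id] = state;
    return state;
  }

  // Same shape as the proposed GetQueryExecState(): the map lock is
  // released before the per-query lock is taken.
  std::shared_ptr<QueryExecState> Get(int64_t id, bool lock) {
    std::shared_ptr<QueryExecState> ret;
    {
      std::lock_guard<std::mutex> l(map_lock_);
      auto it = map_.find(id);
      if (it == map_.end()) return nullptr;
      ret = it->second;  // the copy keeps the state alive after unlock
    }  // give up map_lock_
    if (lock) ret->lock.lock();
    return ret;
  }

 private:
  std::mutex map_lock_;
  std::map<int64_t, std::shared_ptr<QueryExecState>> map_;
};

// Returns true if a second query can be registered and looked up while
// the first query is "planning" and a web handler is waiting on it.
bool MapUsableDuringPlanning() {
  Registry registry;
  std::shared_ptr<QueryExecState> planning = registry.Register(1);
  planning->lock.lock();  // simulate planning holding the query's lock

  std::thread web([&] {
    // Waits for planning to end, but without holding the map lock.
    std::shared_ptr<QueryExecState> s = registry.Get(1, true);
    s->lock.unlock();
  });

  // With the original locking order these calls could hang forever in
  // this demo, because the web thread would hold the map lock.
  registry.Register(2);
  bool ok = registry.Get(2, false) != nullptr;

  planning->lock.unlock();  // planning finished
  web.join();
  return ok;
}
```

One consequence of this ordering is that the entry may be removed from the map between the lookup and the lock acquisition. The shared_ptr copy keeps the object valid, but a caller could end up holding the lock of a query that is no longer registered.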
      

          People

            Assignee: Bharath Vissapragada (bharathv)
            Reporter: Henry Robinson (henryr)