Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-4647

C++ client is not propagating a connection failed error when a drillbit goes down

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.8.0
    • None
    • None

    Description

      When a drillbit goes down, there are two conditions under which the client is not propagating the error back to the application -
      1) The application is in a submitQuery call: the ODBC driver is expecting that the error be reported thru the query results listener which hasn't been registered at the point the error is encountered.
      2) A submitQuery call succeeded but never reached the drillbit because it was shutdown. In this case the application has a handle to a query and is listening for results which will never arrive. The heartbeat mechanism detects the failure, but is not propagating the error to the query results listener.

      Attachments

        Issue Links

          Activity

            githubbot ASF GitHub Bot added a comment -

            GitHub user parthchandra opened a pull request:

            https://github.com/apache/drill/pull/493

            DRILL-4647: C++ client fails to propagate a dead connection error to …

            …the application.

            Return correct status from the DrillClient API if query submission fails.
            Also, propagate error back to the application thru the listener if query submission fails or if there is a heartbeat failure before the response to query submission is received.

            You can merge this pull request into a Git repository by running:

            $ git pull https://github.com/parthchandra/drill DRILL-4647

            Alternatively you can review and apply these changes as the patch at:

            https://github.com/apache/drill/pull/493.patch

            To close this pull request, make a commit to your master/trunk branch
            with (at least) the following in the commit message:

            This closes #493


            commit e39f263eb18fb52d4105dfc8d3131357ca0a19a9
            Author: Parth Chandra <parthc@apache.org>
            Date: 2016-04-30T00:38:44Z

            DRILL-4647: C++ client fails to propagate a dead connection error to the application


            githubbot ASF GitHub Bot added a comment - GitHub user parthchandra opened a pull request: https://github.com/apache/drill/pull/493 DRILL-4647 : C++ client fails to propagate a dead connection error to … …the application. Return correct status from the DrillClient API if query submission fails. Also, propagate error back to the application thru the listener if query submission fails or if there is a heartbeat failure before the response to query submission is received. You can merge this pull request into a Git repository by running: $ git pull https://github.com/parthchandra/drill DRILL-4647 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/493.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #493 commit e39f263eb18fb52d4105dfc8d3131357ca0a19a9 Author: Parth Chandra <parthc@apache.org> Date: 2016-04-30T00:38:44Z DRILL-4647 : C++ client fails to propagate a dead connection error to the application
            githubbot ASF GitHub Bot added a comment -

            Github user sudheeshkatkam commented on a diff in the pull request:

            https://github.com/apache/drill/pull/493#discussion_r69653159

            — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp —
            @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t

            uint64_t coordId;
            DrillClientQueryResult* pQuery=NULL;
            + connectionStatus_t cStatus=CONN_SUCCESS;
            {
            boost::lock_guard<boost::mutex> prLock(this->m_prMutex);
            boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex);
            coordId = this->getNextCoordinationId();
            OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query);

            • sendSync(out_msg);

            + // Create the result object and register the listener before we send the query
            + // because sometimes the caller is not checking the status of the submitQuery call.
            + // This way, the broadcast error call will cause the results listener to be called
            + // with a COMM_ERROR status.
            pQuery = new DrillClientQueryResult(this, coordId, plan);
            pQuery->registerListener(l, lCtx);

            • bool sendRequest=false;
              this->m_queryIds[coordId]=pQuery;
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
              + connectionStatus_t cStatus=sendSync(out_msg);
              + if(cStatus == CONN_SUCCESS){
              + bool sendRequest=false;
            • if(m_pendingRequests++==0) { - sendRequest=true; - }

              else

              { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - }
            • if(sendRequest){
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = "
            • << m_pendingRequests << std::endl
            • getNextResult(); // async wait for results
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
              +
              + if(m_pendingRequests++==0) { + sendRequest=true; + }

              else

              { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) + }

              + if(sendRequest)

              { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " + << m_pendingRequests << std::endl;) + getNextResult(); // async wait for results + }

              }
              +
              + }
              + if(cStatus!=CONN_SUCCESS){

                • End diff –

            This block fails only this query. How are the other queries notified? Heartbeat failure?

            githubbot ASF GitHub Bot added a comment - Github user sudheeshkatkam commented on a diff in the pull request: https://github.com/apache/drill/pull/493#discussion_r69653159 — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp — @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t uint64_t coordId; DrillClientQueryResult* pQuery=NULL; + connectionStatus_t cStatus=CONN_SUCCESS; { boost::lock_guard<boost::mutex> prLock(this->m_prMutex); boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex); coordId = this->getNextCoordinationId(); OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query); sendSync(out_msg); + // Create the result object and register the listener before we send the query + // because sometimes the caller is not checking the status of the submitQuery call. + // This way, the broadcast error call will cause the results listener to be called + // with a COMM_ERROR status. pQuery = new DrillClientQueryResult(this, coordId, plan); pQuery->registerListener(l, lCtx); bool sendRequest=false; this->m_queryIds [coordId] =pQuery; DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl + connectionStatus_t cStatus=sendSync(out_msg); + if(cStatus == CONN_SUCCESS){ + bool sendRequest=false; if(m_pendingRequests++==0) { - sendRequest=true; - } else { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - } if(sendRequest){ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " << m_pendingRequests << std::endl getNextResult(); // async wait for results + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl + + if(m_pendingRequests++==0) { + sendRequest=true; + } else { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) + } + if(sendRequest) { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " + << m_pendingRequests << std::endl;) + getNextResult(); // async wait for results + } } + + } + if(cStatus!=CONN_SUCCESS){ End diff – This block fails only this query. How are the other queries notified? Heartbeat failure?
            githubbot ASF GitHub Bot added a comment -

            Github user sudheeshkatkam commented on a diff in the pull request:

            https://github.com/apache/drill/pull/493#discussion_r69653162

            — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp —
            @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t

            uint64_t coordId;
            DrillClientQueryResult* pQuery=NULL;
            + connectionStatus_t cStatus=CONN_SUCCESS;
            {
            boost::lock_guard<boost::mutex> prLock(this->m_prMutex);
            boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex);
            coordId = this->getNextCoordinationId();
            OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query);

            • sendSync(out_msg);

            + // Create the result object and register the listener before we send the query
            + // because sometimes the caller is not checking the status of the submitQuery call.
            + // This way, the broadcast error call will cause the results listener to be called
            + // with a COMM_ERROR status.
            pQuery = new DrillClientQueryResult(this, coordId, plan);
            pQuery->registerListener(l, lCtx);

            • bool sendRequest=false;
              this->m_queryIds[coordId]=pQuery;
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
              + connectionStatus_t cStatus=sendSync(out_msg);
              + if(cStatus == CONN_SUCCESS){
              + bool sendRequest=false;
            • if(m_pendingRequests++==0) { - sendRequest=true; - }

              else

              { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - }
            • if(sendRequest){
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = "
            • << m_pendingRequests << std::endl
            • getNextResult(); // async wait for results
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
                • End diff –

            Decrease indent level from 494-501.

            githubbot ASF GitHub Bot added a comment - Github user sudheeshkatkam commented on a diff in the pull request: https://github.com/apache/drill/pull/493#discussion_r69653162 — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp — @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t uint64_t coordId; DrillClientQueryResult* pQuery=NULL; + connectionStatus_t cStatus=CONN_SUCCESS; { boost::lock_guard<boost::mutex> prLock(this->m_prMutex); boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex); coordId = this->getNextCoordinationId(); OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query); sendSync(out_msg); + // Create the result object and register the listener before we send the query + // because sometimes the caller is not checking the status of the submitQuery call. + // This way, the broadcast error call will cause the results listener to be called + // with a COMM_ERROR status. pQuery = new DrillClientQueryResult(this, coordId, plan); pQuery->registerListener(l, lCtx); bool sendRequest=false; this->m_queryIds [coordId] =pQuery; DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl + connectionStatus_t cStatus=sendSync(out_msg); + if(cStatus == CONN_SUCCESS){ + bool sendRequest=false; if(m_pendingRequests++==0) { - sendRequest=true; - } else { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - } if(sendRequest){ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " << m_pendingRequests << std::endl getNextResult(); // async wait for results + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl End diff – Decrease indent level from 494-501.
            githubbot ASF GitHub Bot added a comment -

            Github user parthchandra commented on a diff in the pull request:

            https://github.com/apache/drill/pull/493#discussion_r69666212

            — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp —
            @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t

            uint64_t coordId;
            DrillClientQueryResult* pQuery=NULL;
            + connectionStatus_t cStatus=CONN_SUCCESS;
            {
            boost::lock_guard<boost::mutex> prLock(this->m_prMutex);
            boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex);
            coordId = this->getNextCoordinationId();
            OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query);

            • sendSync(out_msg);

            + // Create the result object and register the listener before we send the query
            + // because sometimes the caller is not checking the status of the submitQuery call.
            + // This way, the broadcast error call will cause the results listener to be called
            + // with a COMM_ERROR status.
            pQuery = new DrillClientQueryResult(this, coordId, plan);
            pQuery->registerListener(l, lCtx);

            • bool sendRequest=false;
              this->m_queryIds[coordId]=pQuery;
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
              + connectionStatus_t cStatus=sendSync(out_msg);
              + if(cStatus == CONN_SUCCESS){
              + bool sendRequest=false;
            • if(m_pendingRequests++==0) { - sendRequest=true; - }

              else

              { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - }
            • if(sendRequest){
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = "
            • << m_pendingRequests << std::endl
            • getNextResult(); // async wait for results
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
              +
              + if(m_pendingRequests++==0) { + sendRequest=true; + }

              else

              { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) + }

              + if(sendRequest)

              { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " + << m_pendingRequests << std::endl;) + getNextResult(); // async wait for results + }

              }
              +
              + }
              + if(cStatus!=CONN_SUCCESS){

                • End diff –

            No. If there is a failure in sending, then the sendSync calls handleConnError which fails all the other pending queries (it call broadcastError).
            The block above simple removes the queryId for this query from the list of queries currently executing.

            githubbot ASF GitHub Bot added a comment - Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/493#discussion_r69666212 — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp — @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t uint64_t coordId; DrillClientQueryResult* pQuery=NULL; + connectionStatus_t cStatus=CONN_SUCCESS; { boost::lock_guard<boost::mutex> prLock(this->m_prMutex); boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex); coordId = this->getNextCoordinationId(); OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query); sendSync(out_msg); + // Create the result object and register the listener before we send the query + // because sometimes the caller is not checking the status of the submitQuery call. + // This way, the broadcast error call will cause the results listener to be called + // with a COMM_ERROR status. pQuery = new DrillClientQueryResult(this, coordId, plan); pQuery->registerListener(l, lCtx); bool sendRequest=false; this->m_queryIds [coordId] =pQuery; DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl + connectionStatus_t cStatus=sendSync(out_msg); + if(cStatus == CONN_SUCCESS){ + bool sendRequest=false; if(m_pendingRequests++==0) { - sendRequest=true; - } else { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - } if(sendRequest){ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " << m_pendingRequests << std::endl getNextResult(); // async wait for results + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl + + if(m_pendingRequests++==0) { + sendRequest=true; + } else { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) + } + if(sendRequest) { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " + << m_pendingRequests << std::endl;) + getNextResult(); // async wait for results + } } + + } + if(cStatus!=CONN_SUCCESS){ End diff – No. If there is a failure in sending, then the sendSync calls handleConnError which fails all the other pending queries (it call broadcastError). The block above simple removes the queryId for this query from the list of queries currently executing.
            githubbot ASF GitHub Bot added a comment -

            Github user parthchandra commented on a diff in the pull request:

            https://github.com/apache/drill/pull/493#discussion_r69666595

            — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp —
            @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t

            uint64_t coordId;
            DrillClientQueryResult* pQuery=NULL;
            + connectionStatus_t cStatus=CONN_SUCCESS;
            {
            boost::lock_guard<boost::mutex> prLock(this->m_prMutex);
            boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex);
            coordId = this->getNextCoordinationId();
            OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query);

            • sendSync(out_msg);

            + // Create the result object and register the listener before we send the query
            + // because sometimes the caller is not checking the status of the submitQuery call.
            + // This way, the broadcast error call will cause the results listener to be called
            + // with a COMM_ERROR status.
            pQuery = new DrillClientQueryResult(this, coordId, plan);
            pQuery->registerListener(l, lCtx);

            • bool sendRequest=false;
              this->m_queryIds[coordId]=pQuery;
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
              + connectionStatus_t cStatus=sendSync(out_msg);
              + if(cStatus == CONN_SUCCESS){
              + bool sendRequest=false;
            • if(m_pendingRequests++==0) { - sendRequest=true; - }

              else

              { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - }
            • if(sendRequest){
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = "
            • << m_pendingRequests << std::endl
            • getNextResult(); // async wait for results
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
                • End diff –

            Hmm. Vim's code formatter got confused by the fact that the previous line of code had no terminal semi-colon.
            Fixed it manually.

            githubbot ASF GitHub Bot added a comment - Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/493#discussion_r69666595 — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp — @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t uint64_t coordId; DrillClientQueryResult* pQuery=NULL; + connectionStatus_t cStatus=CONN_SUCCESS; { boost::lock_guard<boost::mutex> prLock(this->m_prMutex); boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex); coordId = this->getNextCoordinationId(); OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query); sendSync(out_msg); + // Create the result object and register the listener before we send the query + // because sometimes the caller is not checking the status of the submitQuery call. + // This way, the broadcast error call will cause the results listener to be called + // with a COMM_ERROR status. pQuery = new DrillClientQueryResult(this, coordId, plan); pQuery->registerListener(l, lCtx); bool sendRequest=false; this->m_queryIds [coordId] =pQuery; DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl + connectionStatus_t cStatus=sendSync(out_msg); + if(cStatus == CONN_SUCCESS){ + bool sendRequest=false; if(m_pendingRequests++==0) { - sendRequest=true; - } else { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - } if(sendRequest){ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " << m_pendingRequests << std::endl getNextResult(); // async wait for results + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl End diff – Hmm. Vim's code formatter got confused by the fact that the previous line of code had no terminal semi-colon. Fixed it manually.
            githubbot ASF GitHub Bot added a comment -

            Github user parthchandra commented on a diff in the pull request:

            https://github.com/apache/drill/pull/493#discussion_r69666695

            — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp —
            @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t

            uint64_t coordId;
            DrillClientQueryResult* pQuery=NULL;
            + connectionStatus_t cStatus=CONN_SUCCESS;
            {
            boost::lock_guard<boost::mutex> prLock(this->m_prMutex);
            boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex);
            coordId = this->getNextCoordinationId();
            OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query);

            • sendSync(out_msg);

            + // Create the result object and register the listener before we send the query
            + // because sometimes the caller is not checking the status of the submitQuery call.
            + // This way, the broadcast error call will cause the results listener to be called
            + // with a COMM_ERROR status.
            pQuery = new DrillClientQueryResult(this, coordId, plan);
            pQuery->registerListener(l, lCtx);

            • bool sendRequest=false;
              this->m_queryIds[coordId]=pQuery;
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
              + connectionStatus_t cStatus=sendSync(out_msg);
              + if(cStatus == CONN_SUCCESS){
              + bool sendRequest=false;
            • if(m_pendingRequests++==0) { - sendRequest=true; - }

              else

              { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - }
            • if(sendRequest){
            • DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = "
            • << m_pendingRequests << std::endl
            • getNextResult(); // async wait for results
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << "[" << m_connectedHost << "]" << "Coordination id = " << coordId << std::endl
              + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl
              +
              + if(m_pendingRequests++==0) { + sendRequest=true; + }

              else

              { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) + }

              + if(sendRequest)

              { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " + << m_pendingRequests << std::endl;) + getNextResult(); // async wait for results + }

              }
              +
              + }
              + if(cStatus!=CONN_SUCCESS){

                • End diff –

            Though you are correct in once respect. The only reliable way to know if a connection is gone is the heartbeat. So a heartbeat failure will eventually occur and will guarantee that any pending queries fail in case of a network failure.

            githubbot ASF GitHub Bot added a comment - Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/493#discussion_r69666695 — Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp — @@ -469,34 +471,50 @@ DrillClientQueryResult* DrillClientImpl::SubmitQuery(::exec::shared::QueryType t uint64_t coordId; DrillClientQueryResult* pQuery=NULL; + connectionStatus_t cStatus=CONN_SUCCESS; { boost::lock_guard<boost::mutex> prLock(this->m_prMutex); boost::lock_guard<boost::mutex> dcLock(this->m_dcMutex); coordId = this->getNextCoordinationId(); OutBoundRpcMessage out_msg(exec::rpc::REQUEST, exec::user::RUN_QUERY, coordId, &query); sendSync(out_msg); + // Create the result object and register the listener before we send the query + // because sometimes the caller is not checking the status of the submitQuery call. + // This way, the broadcast error call will cause the results listener to be called + // with a COMM_ERROR status. pQuery = new DrillClientQueryResult(this, coordId, plan); pQuery->registerListener(l, lCtx); bool sendRequest=false; this->m_queryIds [coordId] =pQuery; DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl + connectionStatus_t cStatus=sendSync(out_msg); + if(cStatus == CONN_SUCCESS){ + bool sendRequest=false; if(m_pendingRequests++==0) { - sendRequest=true; - } else { - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) - DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) - } if(sendRequest){ DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " << m_pendingRequests << std::endl getNextResult(); // async wait for results + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query request. " << " [" << m_connectedHost << "] " << "Coordination id = " << coordId << std::endl + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sent query " << "Coordination id = " << coordId << " query: " << plan << std::endl + + if(m_pendingRequests++==0) { + sendRequest=true; + } else { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Queueing query request to server" << std::endl;) + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Number of pending requests = " << m_pendingRequests << std::endl;) + } + if(sendRequest) { + DRILL_MT_LOG(DRILL_LOG(LOG_DEBUG) << "Sending query request. Number of pending requests = " + << m_pendingRequests << std::endl;) + getNextResult(); // async wait for results + } } + + } + if(cStatus!=CONN_SUCCESS){ End diff – Though you are correct in once respect. The only reliable way to know if a connection is gone is the heartbeat. So a heartbeat failure will eventually occur and will guarantee that any pending queries fail in case of a network failure.
            githubbot ASF GitHub Bot added a comment -

            Github user sudheeshkatkam commented on the issue:

            https://github.com/apache/drill/pull/493

            +1

            githubbot ASF GitHub Bot added a comment - Github user sudheeshkatkam commented on the issue: https://github.com/apache/drill/pull/493 +1
            githubbot ASF GitHub Bot added a comment -

            Github user asfgit closed the pull request at:

            https://github.com/apache/drill/pull/493

            githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/493

            People

              Unassigned Unassigned
              parthc Parth Chandra
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: