Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-5659

C++ Client (master) behavior is unstable resulting incorrect result or exception in API calls

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.11.0
    • Fix Version/s: 1.11.0
    • Component/s: None
    • Labels:

      Description

      I recently compiled the C++ client (on windows) from master and tested against a 1.9.0 drillbit. The client's behavior does not meet the stable release requirement.

      Some API functionalities are broken and should be investigated.

      Most noticeable is the getColumns(...) call. It will throw an exception with "Cannot decode getcolumns results" when the number of rows (column records) exceeds a certain number.

      I also noticed that: during query execution + data retrieval, if the table is large enough, the result coming back will start to become NULL or empty.

      I will see if I can generate some drillclient logs to put in the attachment.
      I will also compile and test on other platforms.

        Issue Links

          Activity

          Hide
          robertw Rob Wu added a comment -

          Added some logs generated by query submitter

          Show
          robertw Rob Wu added a comment - Added some logs generated by query submitter
          Hide
          robertw Rob Wu added a comment -

          "1-10cppClient" prefix logs were generated by 1.10.0 branch Drill Client.
          "1-11cppClient" prefix logs were generated by the current master branch (1.11.0 snapshot) Drill Client.
          "dfs" suffix log files were generated by query execution on a dfs file.
          "hive" suffix log files were generated by query execution on a hive table.
          "metadata and catalog test" suffix log files were generated with api=meta with added custom catalog functions test (getCatalogs, getSchemas, getTables, getColumns)

          Show
          robertw Rob Wu added a comment - "1-10cppClient" prefix logs were generated by 1.10.0 branch Drill Client. "1-11cppClient" prefix logs were generated by the current master branch (1.11.0 snapshot) Drill Client. "dfs" suffix log files were generated by query execution on a dfs file. "hive" suffix log files were generated by query execution on a hive table. "metadata and catalog test" suffix log files were generated with api=meta with added custom catalog functions test (getCatalogs, getSchemas, getTables, getColumns)
          Hide
          rhou Robert Hou added a comment -

          Hi Rob,

          Is this a blocking issue for you? How likely is it that a customer will encounter this issue?

          Show
          rhou Robert Hou added a comment - Hi Rob, Is this a blocking issue for you? How likely is it that a customer will encounter this issue?
          Hide
          robertw Rob Wu added a comment -

          Hi Robert,

          This is not a blocking issue atm. But it should definitely be looked at before the 1.11 release so it does not affect the upcoming driver release. This issue only occurs on 1.11 Snapshot C++ client, so 1.10 and below are not affected.

          Rob

          Show
          robertw Rob Wu added a comment - Hi Robert, This is not a blocking issue atm. But it should definitely be looked at before the 1.11 release so it does not affect the upcoming driver release. This issue only occurs on 1.11 Snapshot C++ client, so 1.10 and below are not affected. Rob
          Hide
          robertw Rob Wu added a comment -

          Compiled and tested on Windows and Linux (CentOS-5.9).
          Added 2 more logs

          Show
          robertw Rob Wu added a comment - Compiled and tested on Windows and Linux (CentOS-5.9). Added 2 more logs
          Hide
          laurentgo Laurent Goujon added a comment -

          I think I reproduce the issue with the following query:

          SELECT 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 FROM (VALUES(1))
          

          The JDBC driver doesn't seem to be affected

          Show
          laurentgo Laurent Goujon added a comment - I think I reproduce the issue with the following query: SELECT 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 FROM (VALUES(1)) The JDBC driver doesn't seem to be affected
          Hide
          parthc Parth Chandra added a comment -

          I can also reproduce it reading from the tpch lineitem table, though Laurent's is clearly the smallest reproduction of the issue. FWIW, valgrind shows no corruption or leaks when this issue crops up.

          Show
          parthc Parth Chandra added a comment - I can also reproduce it reading from the tpch lineitem table, though Laurent's is clearly the smallest reproduction of the issue. FWIW, valgrind shows no corruption or leaks when this issue crops up.
          Hide
          parthc Parth Chandra added a comment -

          Definitely caused by the commit ( d11aba2e55323bb5a6a9deb5bb09fd87470dcedf ) for DRILL-4335. Unfortunately, the refactoring changes are too many. I'll spend the rest of the day doing a diff and trying to figure out. Seems easier to do that since the problem is showing up intermittently for me.

          Show
          parthc Parth Chandra added a comment - Definitely caused by the commit ( d11aba2e55323bb5a6a9deb5bb09fd87470dcedf ) for DRILL-4335 . Unfortunately, the refactoring changes are too many. I'll spend the rest of the day doing a diff and trying to figure out. Seems easier to do that since the problem is showing up intermittently for me.
          Hide
          parthc Parth Chandra added a comment -

          So, it looks like this:

          The original code for reading (synchronously) from a socket looked like this:

               while(1){
                  size_t dataBytesRead = m_socket.read_some(boost::asio::buffer(inBuf, bytesToRead),
                                                     errorCode);
                  if(errorCode){
                      break;
                  } 
                 ...
              }
          

          and now it looks like

               while(1){
                  size_t dataBytesRead = m_socket.read_some(boost::asio::buffer(inBuf, bytesToRead),
                                                     errorCode);
                  if(EINTR != errorCode.value()){
                      break;
                  } 
                 ...
              }
          

          The original works correctly and the new code doesn't.

          If

          errorCode.value() == 0

          we break out of the while loop even if we have not managed to read all the data off the socket. This typically will happen when there is a large data set and the server is returning data really, really fast so there is a lot of data in the socket stream.

          Modifying the code to check for errorCode (which is equivalent to checking errorCode.value() != 0) first, makes everyone happy again.

          Will submit a patch after a little more testing.

          Show
          parthc Parth Chandra added a comment - So, it looks like this: The original code for reading (synchronously) from a socket looked like this: while (1){ size_t dataBytesRead = m_socket.read_some(boost::asio::buffer(inBuf, bytesToRead), errorCode); if (errorCode){ break ; } ... } and now it looks like while (1){ size_t dataBytesRead = m_socket.read_some(boost::asio::buffer(inBuf, bytesToRead), errorCode); if (EINTR != errorCode.value()){ break ; } ... } The original works correctly and the new code doesn't. If errorCode.value() == 0 we break out of the while loop even if we have not managed to read all the data off the socket. This typically will happen when there is a large data set and the server is returning data really, really fast so there is a lot of data in the socket stream. Modifying the code to check for errorCode (which is equivalent to checking errorCode.value() != 0) first, makes everyone happy again. Will submit a patch after a little more testing.
          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user parthchandra opened a pull request:

          https://github.com/apache/drill/pull/876

          DRILL-5659: Fix error code checking in reading from socket

          Check if errorcode is set before checking if the error code is EINTR. Also, check for error first thing after making the read_some or write_some calls.
          This fixes the case when we read (or write) only a part of the data in a call to read from or write to the socket.
          Tested for the issue on Mac/Centos. Not tested on Windows. Not tested for the case with SASL encryption.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/parthchandra/drill DRILL-5659

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/drill/pull/876.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #876


          commit 0a29ced81dc935b926e734ea169eb884a2cbe40f
          Author: Parth Chandra <parthc@apache.org>
          Date: 2017-07-12T21:39:17Z

          DRILL-5659: Fix error code checking in reading from socket


          Show
          githubbot ASF GitHub Bot added a comment - GitHub user parthchandra opened a pull request: https://github.com/apache/drill/pull/876 DRILL-5659 : Fix error code checking in reading from socket Check if errorcode is set before checking if the error code is EINTR. Also, check for error first thing after making the read_some or write_some calls. This fixes the case when we read (or write) only a part of the data in a call to read from or write to the socket. Tested for the issue on Mac/Centos. Not tested on Windows. Not tested for the case with SASL encryption. You can merge this pull request into a Git repository by running: $ git pull https://github.com/parthchandra/drill DRILL-5659 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/876.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #876 commit 0a29ced81dc935b926e734ea169eb884a2cbe40f Author: Parth Chandra <parthc@apache.org> Date: 2017-07-12T21:39:17Z DRILL-5659 : Fix error code checking in reading from socket
          Hide
          parthc Parth Chandra added a comment -

          Sorabh Hamirwasia, Rob Wu, Laurent Goujon can you folks please review and try this fix out (esp with the sasl encrption enabled)?

          Show
          parthc Parth Chandra added a comment - Sorabh Hamirwasia , Rob Wu , Laurent Goujon can you folks please review and try this fix out (esp with the sasl encrption enabled)?
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user sohami commented on the issue:

          https://github.com/apache/drill/pull/876

          +1 LGTM
          Thanks a lot for catching and fixing this!. I will test it with SASL encryption case.

          Show
          githubbot ASF GitHub Bot added a comment - Github user sohami commented on the issue: https://github.com/apache/drill/pull/876 +1 LGTM Thanks a lot for catching and fixing this!. I will test it with SASL encryption case.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user superbstreak commented on the issue:

          https://github.com/apache/drill/pull/876

          Thanks, @parthchandra!

          Show
          githubbot ASF GitHub Bot added a comment - Github user superbstreak commented on the issue: https://github.com/apache/drill/pull/876 Thanks, @parthchandra!
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user laurentgo commented on the issue:

          https://github.com/apache/drill/pull/876

          I tested non-SASL case and confirm it fixes the issue. Unfortunately I don't have any handy SASL-ready setup for now.

          Thanks @parthchandra

          Show
          githubbot ASF GitHub Bot added a comment - Github user laurentgo commented on the issue: https://github.com/apache/drill/pull/876 I tested non-SASL case and confirm it fixes the issue. Unfortunately I don't have any handy SASL-ready setup for now. Thanks @parthchandra
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user parthchandra commented on the issue:

          https://github.com/apache/drill/pull/876

          Still haven't been able to confirm the SASL encryption case, but there is no reason why this fix would affect that. Merging it in.

          Show
          githubbot ASF GitHub Bot added a comment - Github user parthchandra commented on the issue: https://github.com/apache/drill/pull/876 Still haven't been able to confirm the SASL encryption case, but there is no reason why this fix would affect that. Merging it in.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/drill/pull/876

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/876
          Hide
          robertw Rob Wu added a comment -

          Tested on 1.9.0 Drilbit and works correctly now.
          I'll test on 1.10.0 and 1.11.0 later today.

          Show
          robertw Rob Wu added a comment - Tested on 1.9.0 Drilbit and works correctly now. I'll test on 1.10.0 and 1.11.0 later today.
          Hide
          robertw Rob Wu added a comment - - edited

          Tested Linux 1.11 master snapshot against 1.9.0 and 1.10.0 drillbits with authentication.
          Just compiled windows and tested against 1.9.0 worked correctly.
          Laurent Goujon Parth Chandra Have you tried with windows? I seemed to be getting error testing against 1.10+ drillbits when receiving server metadata in processServerMetaResult(...) resulting socket shutdown.

          Drill Client Library
          Build info:
          Loaded by process: C:\Program Files (x86)\Microsoft Data Access SDK 2.8\Tools\amd64\odbct32w.exe
          Current process id is: 9300
          Initialized Logging to file (D:\Logs\DM\drillclient20170718T173700.log). 
          2017-Jul-18 17:37:00 : TRACE   : 2d48 : Connecting to endpoint: 192.168.205.161:31010
          2017-Jul-18 17:37:00 : TRACE   : 2d48 : 192.168.205.161:31010
          2017-Jul-18 17:37:00 : INFO    : 2d48 : Connected to endpoint: id: 396 address: 192.168.205.161:31010
          2017-Jul-18 17:37:00 : TRACE   : 2d48 : validateHandShake
          2017-Jul-18 17:37:00 : INFO    : 2d48 : userName:mapr
          2017-Jul-18 17:37:00 : INFO    : 2d48 : password: ********** 
          2017-Jul-18 17:37:00 : INFO    : 2d48 : schema:default
          2017-Jul-18 17:37:00 : TRACE   : 2d48 : Sent handshake request message. Coordination id: 1221
          2017-Jul-18 17:37:00 : TRACE   : 2d48 : Started new handshake wait timer with 5 seconds.
          2017-Jul-18 17:37:00 : DEBUG   : 2d48 : DrillClientImpl::recvHandshake: async read waiting for server handshake response.
          2017-Jul-18 17:37:00 : TRACE   : 2d48 : Deadline timer cancelled.
          2017-Jul-18 17:37:00 : TRACE   : 2d48 : Socket read: reading 87data bytes
          2017-Jul-18 17:37:00 : TRACE   : 2d48 : Socket read: actual bytes read = 87
          2017-Jul-18 17:37:03 : TRACE   : 2d48 : Started new heartbeat timer with 15 seconds.
          2017-Jul-18 17:37:03 : DEBUG   : 2d48 : DrillClientImpl::startMessageListener: io_service is stopped. Restarting.
          2017-Jul-18 17:37:03 : DEBUG   : 2d48 : DrillClientImpl::startMessageListener: Starting listener thread: 0000000003E61200
          2017-Jul-18 17:37:03 : TRACE   : 2d48 : Getting metadata
          2017-Jul-18 17:37:03 : TRACE   : 2d48 : Server metadata supported.
          2017-Jul-18 17:37:03 : DEBUG   : 2d48 : Sent GET_SERVER_META request. [id: 396 address: 192.168.205.161:31010]Coordination id = 1222
          2017-Jul-18 17:37:03 : DEBUG   : 2d48 : Sent GET_SERVER_META Coordination id = 1222 query: server meta
          2017-Jul-18 17:37:03 : DEBUG   : 2d48 : Sending GET_SERVER_META request. Number of pending requests = 1
          2017-Jul-18 17:37:03 : TRACE   : 2d48 : Read blocked waiting for memory.
          2017-Jul-18 17:37:03 : TRACE   : 2d48 : Started new query wait timer with 180 seconds.
          2017-Jul-18 17:37:03 : TRACE   : 2d48 : Started new heartbeat timer with 15 seconds.
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : DrillClientImpl:: Heartbeat timer expired.
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : DrillClientImpl::handleRead: Handle Read from buffer 0000000003E615C0
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : DrillClientImpl::handleRead: Cancel deadline timer.
          2017-Jul-18 17:37:03 : DEBUG   : 2d48 : DrillClientImpl::getNextResult: async_read from the server
          2017-Jul-18 17:37:03 : TRACE   : 2d48 : Server metadata received.
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : Getting new message
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : DrillClientImpl::readMsg: Read message from buffer 0000000003E615C0
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : Length bytes = 2
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : Msg Length = 5637
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : DrillClientImpl::readLenBytesFromSocket: Allocated and locked buffer: [ 0000000003E11CB0, size = 5639 ]
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : Copied bufWithLen into bufferWithLenBytes. Now reading data (rmsgLen - leftover) : 5634
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : Socket read: reading 5634data bytes
          2017-Jul-18 17:37:03 : TRACE   : 2d48 : Getting metadata
          2017-Jul-18 17:37:03 : TRACE   : 2d48 : Server metadata supported.
          2017-Jul-18 17:37:03 : TRACE   : 3fc4 : Socket read: actual bytes read = 5634
          2017-Jul-18 17:37:06 : TRACE   : 3fc4 : Successfully created a RPC message with Coordination id: 1222
          2017-Jul-18 17:37:06 : TRACE   : 3fc4 : DrillClientImpl::readMsg: Free buffer 0000000003E615C0
          2017-Jul-18 17:37:10 : DEBUG   : 3fc4 : Processing GetServerMetaResp with coordination id:1222
          2017-Jul-18 17:37:29 : TRACE   : 3fc4 : Socket shutdown
          2017-Jul-18 17:37:30 : TRACE   : 3fc4 : DrillClientImpl:: Heartbeat timer expired.
          2017-Jul-18 17:37:30 : TRACE   : 2d48 : Socket shutdown
          2017-Jul-18 17:37:30 : TRACE   : 2d48 : Server metadata received.
          2017-Jul-18 17:37:30 : TRACE   : 2d48 : Getting metadata
          2017-Jul-18 17:37:30 : TRACE   : 2d48 : Server metadata supported.
          2017-Jul-18 17:37:30 : TRACE   : 2d48 : Socket shutdown
          2017-Jul-18 17:37:30 : TRACE   : 2d48 : Server metadata received.
          
          Show
          robertw Rob Wu added a comment - - edited Tested Linux 1.11 master snapshot against 1.9.0 and 1.10.0 drillbits with authentication. Just compiled windows and tested against 1.9.0 worked correctly. Laurent Goujon Parth Chandra Have you tried with windows? I seemed to be getting error testing against 1.10+ drillbits when receiving server metadata in processServerMetaResult(...) resulting socket shutdown. Drill Client Library Build info: Loaded by process: C:\Program Files (x86)\Microsoft Data Access SDK 2.8\Tools\amd64\odbct32w.exe Current process id is: 9300 Initialized Logging to file (D:\Logs\DM\drillclient20170718T173700.log). 2017-Jul-18 17:37:00 : TRACE : 2d48 : Connecting to endpoint: 192.168.205.161:31010 2017-Jul-18 17:37:00 : TRACE : 2d48 : 192.168.205.161:31010 2017-Jul-18 17:37:00 : INFO : 2d48 : Connected to endpoint: id: 396 address: 192.168.205.161:31010 2017-Jul-18 17:37:00 : TRACE : 2d48 : validateHandShake 2017-Jul-18 17:37:00 : INFO : 2d48 : userName:mapr 2017-Jul-18 17:37:00 : INFO : 2d48 : password: ********** 2017-Jul-18 17:37:00 : INFO : 2d48 : schema: default 2017-Jul-18 17:37:00 : TRACE : 2d48 : Sent handshake request message. Coordination id: 1221 2017-Jul-18 17:37:00 : TRACE : 2d48 : Started new handshake wait timer with 5 seconds. 2017-Jul-18 17:37:00 : DEBUG : 2d48 : DrillClientImpl::recvHandshake: async read waiting for server handshake response. 2017-Jul-18 17:37:00 : TRACE : 2d48 : Deadline timer cancelled. 2017-Jul-18 17:37:00 : TRACE : 2d48 : Socket read: reading 87data bytes 2017-Jul-18 17:37:00 : TRACE : 2d48 : Socket read: actual bytes read = 87 2017-Jul-18 17:37:03 : TRACE : 2d48 : Started new heartbeat timer with 15 seconds. 2017-Jul-18 17:37:03 : DEBUG : 2d48 : DrillClientImpl::startMessageListener: io_service is stopped. Restarting. 2017-Jul-18 17:37:03 : DEBUG : 2d48 : DrillClientImpl::startMessageListener: Starting listener thread: 0000000003E61200 2017-Jul-18 17:37:03 : TRACE : 2d48 : Getting metadata 2017-Jul-18 17:37:03 : TRACE : 2d48 : Server metadata supported. 2017-Jul-18 17:37:03 : DEBUG : 2d48 : Sent GET_SERVER_META request. [id: 396 address: 192.168.205.161:31010]Coordination id = 1222 2017-Jul-18 17:37:03 : DEBUG : 2d48 : Sent GET_SERVER_META Coordination id = 1222 query: server meta 2017-Jul-18 17:37:03 : DEBUG : 2d48 : Sending GET_SERVER_META request. Number of pending requests = 1 2017-Jul-18 17:37:03 : TRACE : 2d48 : Read blocked waiting for memory. 2017-Jul-18 17:37:03 : TRACE : 2d48 : Started new query wait timer with 180 seconds. 2017-Jul-18 17:37:03 : TRACE : 2d48 : Started new heartbeat timer with 15 seconds. 2017-Jul-18 17:37:03 : TRACE : 3fc4 : DrillClientImpl:: Heartbeat timer expired. 2017-Jul-18 17:37:03 : TRACE : 3fc4 : DrillClientImpl::handleRead: Handle Read from buffer 0000000003E615C0 2017-Jul-18 17:37:03 : TRACE : 3fc4 : DrillClientImpl::handleRead: Cancel deadline timer. 2017-Jul-18 17:37:03 : DEBUG : 2d48 : DrillClientImpl::getNextResult: async_read from the server 2017-Jul-18 17:37:03 : TRACE : 2d48 : Server metadata received. 2017-Jul-18 17:37:03 : TRACE : 3fc4 : Getting new message 2017-Jul-18 17:37:03 : TRACE : 3fc4 : DrillClientImpl::readMsg: Read message from buffer 0000000003E615C0 2017-Jul-18 17:37:03 : TRACE : 3fc4 : Length bytes = 2 2017-Jul-18 17:37:03 : TRACE : 3fc4 : Msg Length = 5637 2017-Jul-18 17:37:03 : TRACE : 3fc4 : DrillClientImpl::readLenBytesFromSocket: Allocated and locked buffer: [ 0000000003E11CB0, size = 5639 ] 2017-Jul-18 17:37:03 : TRACE : 3fc4 : Copied bufWithLen into bufferWithLenBytes. Now reading data (rmsgLen - leftover) : 5634 2017-Jul-18 17:37:03 : TRACE : 3fc4 : Socket read: reading 5634data bytes 2017-Jul-18 17:37:03 : TRACE : 2d48 : Getting metadata 2017-Jul-18 17:37:03 : TRACE : 2d48 : Server metadata supported. 2017-Jul-18 17:37:03 : TRACE : 3fc4 : Socket read: actual bytes read = 5634 2017-Jul-18 17:37:06 : TRACE : 3fc4 : Successfully created a RPC message with Coordination id: 1222 2017-Jul-18 17:37:06 : TRACE : 3fc4 : DrillClientImpl::readMsg: Free buffer 0000000003E615C0 2017-Jul-18 17:37:10 : DEBUG : 3fc4 : Processing GetServerMetaResp with coordination id:1222 2017-Jul-18 17:37:29 : TRACE : 3fc4 : Socket shutdown 2017-Jul-18 17:37:30 : TRACE : 3fc4 : DrillClientImpl:: Heartbeat timer expired. 2017-Jul-18 17:37:30 : TRACE : 2d48 : Socket shutdown 2017-Jul-18 17:37:30 : TRACE : 2d48 : Server metadata received. 2017-Jul-18 17:37:30 : TRACE : 2d48 : Getting metadata 2017-Jul-18 17:37:30 : TRACE : 2d48 : Server metadata supported. 2017-Jul-18 17:37:30 : TRACE : 2d48 : Socket shutdown 2017-Jul-18 17:37:30 : TRACE : 2d48 : Server metadata received.
          Hide
          robertw Rob Wu added a comment -

          I have filed a ticket for the 1.10 undefined behavior at https://issues.apache.org/jira/browse/DRILL-5678 and supplied a patch at https://github.com/apache/drill/pull/880

          Show
          robertw Rob Wu added a comment - I have filed a ticket for the 1.10 undefined behavior at https://issues.apache.org/jira/browse/DRILL-5678 and supplied a patch at https://github.com/apache/drill/pull/880

            People

            • Assignee:
              parthc Parth Chandra
              Reporter:
              robertw Rob Wu
              Reviewer:
              Laurent Goujon
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development