Uploaded image for project: 'Apache Trafodion (Retired)'
  1. Apache Trafodion (Retired)
  2. TRAFODION-1093

LP Bug: 1436994 - mxosrvr core generated during select count(*)




      Encountered on cluster with 20150324_0830 build.

      From client-side, trafci session:
      SQL>select count from nosalt1reg;

          • ERROR[1] The message id: problem_with_server_read
          • ERROR[1] The message id: header_not_long_enough
          • ERROR[1] The message id: problem_with_server_read
          • ERROR[1] The message id: header_not_long_enough

      Stack trace from core:
      Core was generated by `mxosrvr -ZKHOST rhel-qa1.hpl.hp.com:2181 -RZ rhel-qa2.hpl.hp.com:5:1 -ZKPNODE /'.
      Program terminated with signal 6, Aborted.
      #0 0x0000003e8d632925 in raise () from /lib64/libc.so.6
      Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64 glibc-2.12-1.132.el6_5.2.x86_64 hadoop-2.5.0+cdh5.3.2+813-1.cdh5.3.2.p0.17.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-15.el6_5.1.x86_64 libcom_err-1.41.12-18.el6.x86_64 libgcc-4.4.7-4.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 libstdc++-4.4.7-4.el6.x86_64 nspr-4.10.2-1.el6_5.x86_64 nss-3.15.3-6.el6_5.x86_64 nss-softokn-freebl-3.14.3-10.el6_5.x86_64 nss-util-3.15.3-1.el6_5.x86_64 openldap-2.4.23-34.el6_5.1.x86_64 openssl-1.0.1e-16.el6_5.14.x86_64 zlib-1.2.3-29.el6.x86_64
      (gdb) where
      #0 0x0000003e8d632925 in raise () from /lib64/libc.so.6
      #1 0x0000003e8d63408d in abort () from /lib64/libc.so.6
      #2 0x00007f5441f6ca55 in os::abort(bool) ()
      from /usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #3 0x00007f54420ecf87 in VMError::report_and_die() ()
      from /usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #4 0x00007f5441f7196f in JVM_handle_linux_signal ()
      from /usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #5 <signal handler called>
      #6 0x00000000004c4b88 in odbc_SQLSrvr_ExecDirect_ame_ (objtag_=0x2af42b0,
      call_id_=0x2af4308, dialogueId=479758385, stmtLabel=0x2fcda50 "SQL_CUR_2",
      cursorName=<value optimized out>, stmtExplainLabel=<value optimized out>,
      stmtType=0, sqlStmtType=1,
      sqlString=0x2fcda24 "select count from nosalt1reg", sqlAsyncEnable=0,
      queryTimeout=0, inputRowCnt=0, txnID=0, holdableCursor=0)
      at SrvrConnect.cpp:7785
      #7 0x0000000000493aa6 in SQLEXECUTE_IOMessage (objtag_=0x2af42b0,
      call_id_=0x2af4308, operation_id=3012) at Interface/odbcs_srvr.cpp:1728
      #8 0x0000000000493b54 in DISPATCH_TCPIPRequest (objtag_=0x2af42b0,
      call_id_=0x2af4308, operation_id=<value optimized out>)
      at Interface/odbcs_srvr.cpp:1793
      #9 0x0000000000433872 in BUILD_TCPIP_REQUEST (pnode=0x2af42b0)
      at ../Common/TCPIPSystemSrvr.cpp:603
      #10 0x000000000043420d in PROCESS_TCPIP_REQUEST (pnode=0x2af42b0)
      at ../Common/TCPIPSystemSrvr.cpp:581
      #11 0x00000000004623f6 in CNSKListenerSrvr::tcpip_listener (arg=0x29bd0d0)
      at Interface/linux/Listener_srvr_ps.cpp:400
      #12 0x00007f5441522290 in sb_thread_sthr_disp (pp_arg=0x2acde60)
      at threadl.cpp:253
      #13 0x0000003e8da079d1 in start_thread () from /lib64/libpthread.so.0
      #14 0x0000003e8d6e8b5d in clone () from /lib64/libc.so.6

      Arvind's initial investigation:
      Seems like statement allocate failed – pSrvrStmt is null.

      2015-03-25 13:44:28,728, ERROR, MXOSRVR, Node Number: 4, CPU: 4, PIN:26182, Process Name:$Z040LD2 , , ,A SQL/MX query failed. SQLCODE: SQL statement allocate failed, sqlcode = -8811, sql error = *** ERROR[8811] Trying to close a statement that is either not in the open state or has not reached EOF. [2015-03-25 13:44:28].
      Error Text: <5>.

      (gdb) p pSrvrStmt
      $18 = (SRVR_STMT_HDL *) 0x0




            daniellu Daniel Lu
            thaiju Julie Thai
            0 Vote for this issue
            2 Start watching this issue