IMPALA / IMPALA-6291

Various crashes and incorrect results on CPUs with AVX512


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0
    • Fix Version/s: Impala 2.11.0
    • Component/s: Backend
    • Environment: Ubuntu 16.04, M5.4xlarge

    Description

      On EC2, M5 and C5 instances use a different hypervisor than M4 and C4 instances. Data loading fails on C5 and M5 instances. An interesting snippet from the end of an impalad log:

      I1207 04:12:07.922456 19933 coordinator.cc:99] Exec() query_id=944ead2f178cf67e:1755131f00000000 stmt=CREATE TABLE tmp_orders_string AS 
                SELECT STRAIGHT_JOIN
                  o_orderkey, o_custkey, o_orderstatus, o_totalprice, o_orderdate,
                  o_orderpriority, o_clerk, o_shippriority, o_comment,
                  GROUP_CONCAT(
                    CONCAT(
                      CAST(l_partkey AS STRING), '\005',
                      CAST(l_suppkey AS STRING), '\005',
                      CAST(l_linenumber AS STRING), '\005',
                      CAST(l_quantity AS STRING), '\005',
                      CAST(l_extendedprice AS STRING), '\005',
                      CAST(l_discount AS STRING), '\005',
                      CAST(l_tax AS STRING), '\005',
                      CAST(l_returnflag AS STRING), '\005',
                      CAST(l_linestatus AS STRING), '\005',
                      CAST(l_shipdate AS STRING), '\005',
                      CAST(l_commitdate AS STRING), '\005',
                      CAST(l_receiptdate AS STRING), '\005',
                      CAST(l_shipinstruct AS STRING), '\005',
                      CAST(l_shipmode AS STRING), '\005',
                      CAST(l_comment AS STRING)
                    ), '\004'
                  ) AS lineitems_string
                FROM tpch_parquet.lineitem
                INNER JOIN [SHUFFLE] tpch_parquet.orders ON o_orderkey = l_orderkey
                WHERE o_orderkey % 1 = 0
                GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9
      ...
      F1207 04:12:08.972215 19953 partitioned-hash-join-node.cc:291] Check failed: probe_batch_pos_ == probe_batch_->num_rows() || probe_batch_pos_ == -1 
      

      The error log shows:

      F1207 04:12:08.972215 19953 partitioned-hash-join-node.cc:291] Check failed: probe_batch_pos_ == probe_batch_->num_rows() || probe_batch_pos_ == -1 
      *** Check failure stack trace: ***
          @          0x3bdcefd  google::LogMessage::Fail()
          @          0x3bde7a2  google::LogMessage::SendToLog()
          @          0x3bdc8d7  google::LogMessage::Flush()
          @          0x3bdfe9e  google::LogMessageFatal::~LogMessageFatal()
          @          0x28bd4db  impala::PartitionedHashJoinNode::NextProbeRowBatch()
          @          0x28c1741  impala::PartitionedHashJoinNode::GetNext()
          @          0x289f71f  impala::PartitionedAggregationNode::GetRowsStreaming()
          @          0x289d8d5  impala::PartitionedAggregationNode::GetNext()
          @          0x1891d1c  impala::FragmentInstanceState::ExecInternal()
          @          0x188f629  impala::FragmentInstanceState::Exec()
          @          0x1878c0a  impala::QueryState::ExecFInstance()
          @          0x18774cc  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
          @          0x1879849  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
          @          0x17c64ba  boost::function0<>::operator()()
          @          0x1abb5a1  impala::Thread::SuperviseThread()
          @          0x1ac412c  boost::_bi::list4<>::operator()<>()
          @          0x1ac406f  boost::_bi::bind_t<>::operator()()
          @          0x1ac4032  boost::detail::thread_data<>::run()
          @          0x2d668ca  thread_proxy
          @     0x7fe9287146ba  start_thread
          @     0x7fe92844a3dd  clone
      Picked up JAVA_TOOL_OPTIONS: -agentlib:jdwp=transport=dt_socket,address=30002,server=y,suspend=n 
      

      To reproduce this, start an M5.4xlarge instance with 250 GB of disk space and run:

      sudo apt-get update
      sudo apt-get install --yes git
      git init ~/Impala
      pushd ~/Impala
      git fetch https://github.com/apache/impala master
      git checkout FETCH_HEAD
      ./bin/bootstrap_development.sh | tee -a $(mktemp -p ~)
      

      You might need to adjust the default security group; I'm not sure. For comparison, you can run the same script on an M4.4xlarge, where it should work.
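
      Since the title attributes these failures to AVX512, it may be worth confirming that the instance's CPU actually advertises AVX-512 before comparing runs on different instance types. The following is a minimal sketch, assuming a Linux guest that exposes CPU flags in /proc/cpuinfo; the helper name avx512_flags is hypothetical and not part of Impala or its scripts.

```shell
# List the distinct AVX-512 feature flags the kernel reports for this CPU.
# Takes an optional path so it can be pointed at a saved copy of /proc/cpuinfo.
# (avx512_flags is a hypothetical helper name, not part of Impala.)
avx512_flags() {
  grep -o 'avx512[a-z0-9_]*' "${1:-/proc/cpuinfo}" 2>/dev/null | sort -u
}

if [ -n "$(avx512_flags)" ]; then
  echo "AVX-512 flags present:"
  avx512_flags
else
  echo "No AVX-512 flags reported"
fi
```

      If the M5 instance lists flags such as avx512f while the M4 instance reports none, that would be consistent with the difference in behavior described above, although it does not by itself identify the faulty code path.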

          People

            Assignee: Tim Armstrong (tarmstrong)
            Reporter: Jim Apple (jbapple)