IMPALA-4490

Check failed: false Unexpected plan node with runtime filters

    Details

      Description

      The random query generator triggered the DCHECK in the summary with this stack:

      #0  0x00007f41a827cc37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
      #1  0x00007f41a8280028 in __GI_abort () at abort.c:89
      #2  0x0000000002807db4 in google::DumpStackTraceAndExit() ()
      #3  0x000000000280121d in google::LogMessage::Fail() ()
      #4  0x0000000002803b46 in google::LogMessage::SendToLog() ()
      #5  0x0000000002800d3d in google::LogMessage::Flush() ()
      #6  0x00000000028045ee in google::LogMessageFatal::~LogMessageFatal() ()
      #7  0x000000000198e70b in impala::Coordinator::UpdateFilterRoutingTable (this=0xa2f8000, fragment_params=...)
          at /home/dev/Impala/be/src/runtime/coordinator.cc:573
      #8  0x000000000198ed03 in impala::Coordinator::StartFInstances (this=0xa2f8000) at /home/dev/Impala/be/src/runtime/coordinator.cc:607
      #9  0x000000000198d580 in impala::Coordinator::Exec (this=0xa2f8000) at /home/dev/Impala/be/src/runtime/coordinator.cc:485
      #10 0x00000000014e2d06 in impala::ImpalaServer::QueryExecState::ExecQueryOrDmlRequest (this=0xf222000, query_exec_request=...)
          at /home/dev/Impala/be/src/service/query-exec-state.cc:445
      #11 0x00000000014dfd99 in impala::ImpalaServer::QueryExecState::Exec (this=0xf222000, exec_request=0x7f4140f06940)
          at /home/dev/Impala/be/src/service/query-exec-state.cc:154
      #12 0x000000000146d8b4 in impala::ImpalaServer::ExecuteInternal (this=0xf0ce600, query_ctx=..., session_state=..., registered_exec_state=0x7f4140f07f8f,
          exec_state=0x7f4140f08330) at /home/dev/Impala/be/src/service/impala-server.cc:815
      #13 0x000000000146d10c in impala::ImpalaServer::Execute (this=0xf0ce600, query_ctx=0x7f4140f08020, session_state=..., exec_state=0x7f4140f08330)
          at /home/dev/Impala/be/src/service/impala-server.cc:762
      #14 0x00000000014d4d96 in impala::ImpalaServer::query (this=0xf0ce600, query_handle=..., query=...)
          at /home/dev/Impala/be/src/service/impala-beeswax-server.cc:66
      #15 0x000000000192907c in beeswax::BeeswaxServiceProcessor::process_query (this=0xee474a0, seqid=0, iprot=0xee74660, oprot=0xefcd830,
          callContext=0x10144180) at /home/dev/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:2979
      #16 0x0000000001928dca in beeswax::BeeswaxServiceProcessor::dispatchCall (this=0xee474a0, iprot=0xee74660, oprot=0xefcd830, fname=..., seqid=0,
          callContext=0x10144180) at /home/dev/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:2952
      #17 0x0000000001912bc7 in impala::ImpalaServiceProcessor::dispatchCall (this=0xee474a0, iprot=0xee74660, oprot=0xefcd830, fname=..., seqid=0,
          callContext=0x10144180) at /home/dev/Impala/be/generated-sources/gen-cpp/ImpalaService.cpp:1673
      #18 0x00000000011630fc in apache::thrift::TDispatchProcessor::process (this=0xee474a0, in=..., out=..., connectionContext=0x10144180)
          at /opt/Impala-Toolchain/thrift-0.9.0-p8/include/thrift/TDispatchProcessor.h:121
      #19 0x00000000027b40eb in apache::thrift::server::TThreadPoolServer::Task::run() ()
      #20 0x000000000279c959 in apache::thrift::concurrency::ThreadManager::Worker::run() ()
      #21 0x000000000132719f in impala::ThriftThread::RunRunnable (this=0x1013c640, runnable=..., promise=0x7fff332e0650)
          at /home/dev/Impala/be/src/rpc/thrift-thread.cc:64
      #22 0x00000000013288cb in boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<unsigned long>*>::operator() (this=0x10150030, p=0x1013c640, a1=..., a2=0x7fff332e0650) at /opt/Impala-Toolchain/boost-1.57.0/include/boost/bind/mem_fn_template.hpp:280
      #23 0x0000000001328761 in boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, boost::_bi::value<impala::Promise<unsigned long>*> >::operator()<boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<unsigned long>*>, boost::_bi::list0> (this=0x10150040, f=..., a=...)
          at /opt/Impala-Toolchain/boost-1.57.0/include/boost/bind/bind.hpp:392
      #24 0x00000000013284ad in boost::_bi::bind_t<void, boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<unsigned long>*>, boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, boost::_bi::value<impala::Promise<unsigned long>*> > >::operator() (this=0x10150030)
          at /opt/Impala-Toolchain/boost-1.57.0/include/boost/bind/bind_template.hpp:20
      #25 0x00000000013283c0 in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<unsigned long>*>, boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, boost::_bi::value<impala::Promise<unsigned long>*> > >, void>::invoke (function_obj_ptr=...)
          at /opt/Impala-Toolchain/boost-1.57.0/include/boost/function/function_template.hpp:153
      #26 0x0000000001335d3e in boost::function0<void>::operator() (this=0x7f4140f08da0)
          at /opt/Impala-Toolchain/boost-1.57.0/include/boost/function/function_template.hpp:767
      #27 0x00000000015e017d in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) (
          name=..., category=..., functor=..., thread_started=0x7fff332e0440) at /home/dev/Impala/be/src/util/thread.cc:317
      #28 0x00000000015e7156 in boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> >::operator()<void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list0&, int) (this=0xf2855c0,
          f=@0xf2855b8: 0x15dfeb8 <impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*)>,
          a=...) at /opt/Impala-Toolchain/boost-1.57.0/include/boost/bind/bind.hpp:457
      #29 0x00000000015e7099 in boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > >::operator()() (this=0xf2855b8) at /opt/Impala-Toolchain/boost-1.57.0/include/boost/bind/bind_template.hpp:20
      #30 0x00000000015e6ff4 in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > > >::run() (this=0xf285400) at /opt/Impala-Toolchain/boost-1.57.0/include/boost/thread/detail/thread.hpp:116
      #31 0x0000000001a3282a in thread_proxy ()
      #32 0x00007f41a8613184 in start_thread (arg=0x7f4140f09700) at pthread_create.c:312
      #33 0x00007f41a834037d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
      

      The query is:

      USE randomness;
      
      SELECT
      MIN(a1.int_col_32) AS int_col,
      MIN(a2.timestamp_col_17) + INTERVAL COALESCE(370, -275, 254) MONTH AS timestamp_col
      FROM table_2 a1
      LEFT JOIN table_10 a2 ON (a1.tinyint_col_11) IS DISTINCT FROM (a2.int_col_22)
      WHERE
      ((a1.int_col_32) - (a1.bigint_col_6)) = (a2.int_col_22);
      

      Plan:

      +--------------------------------------------------------------------------+
      | Explain String                                                           |
      +--------------------------------------------------------------------------+
      | Estimated Per-Host Requirements: Memory=430.00MB VCores=2                |
      |                                                                          |
      | PLAN-ROOT SINK                                                           |
      | |                                                                        |
      | 06:AGGREGATE [FINALIZE]                                                  |
      | |  output: min:merge(a1.int_col_32), min:merge(a2.timestamp_col_17)      |
      | |                                                                        |
      | 05:EXCHANGE [UNPARTITIONED]                                              |
      | |                                                                        |
      | 03:AGGREGATE                                                             |
      | |  output: min(a1.int_col_32), min(a2.timestamp_col_17)                  |
      | |                                                                        |
      | 02:NESTED LOOP JOIN [LEFT OUTER JOIN, BROADCAST]                         |
      | |  join predicates: (a1.tinyint_col_11) IS DISTINCT FROM (a2.int_col_22) |
      | |  predicates: ((a1.int_col_32) - (a1.bigint_col_6)) = (a2.int_col_22)   |
      | |                                                                        |
      | |--04:EXCHANGE [BROADCAST]                                               |
      | |  |                                                                     |
      | |  01:SCAN HDFS [randomness.table_10 a2]                                 |
      | |     partitions=1/1 files=1 size=199.35MB                               |
      | |                                                                        |
      | 00:SCAN HDFS [randomness.table_2 a1]                                     |
      |    partitions=1/1 files=1 size=68.25MB                                   |
      |    runtime filters: RF000 -> ((a1.int_col_32) - (a1.bigint_col_6))       |
      +--------------------------------------------------------------------------+
      

      The logs include the multi-hundred-line text dump of the plan node, which is part of the DCHECK string. It seemed too long to paste or comb through here.

      There's also a running Docker container that has the randomness database loaded. In that container, I was able to reproduce the DCHECK using the query above.

        Activity

        alex.behm Alexander Behm added a comment -

        Nice one, Michael!

        alex.behm Alexander Behm added a comment -

        Minimal repro on the functional db:

        select 1 from
        functional.alltypes a left join functional.alltypessmall b
          on a.id IS DISTINCT FROM b.id
        where a.int_col + a.bigint_col = b.int_col
        
        +-----------------------------------------------------------+
        | Explain String                                            |
        +-----------------------------------------------------------+
        | Estimated Per-Host Requirements: Memory=192.00MB VCores=2 |
        |                                                           |
        | PLAN-ROOT SINK                                            |
        | |                                                         |
        | 04:EXCHANGE [UNPARTITIONED]                               |
        | |                                                         |
        | 02:NESTED LOOP JOIN [LEFT OUTER JOIN, BROADCAST]          |
        | |  join predicates: a.id IS DISTINCT FROM b.id            |
        | |  predicates: a.int_col + a.bigint_col = b.int_col       |
        | |                                                         |
        | |--03:EXCHANGE [BROADCAST]                                |
        | |  |                                                      |
        | |  01:SCAN HDFS [functional.alltypessmall b]              |
        | |     partitions=4/4 files=4 size=6.32KB                  |
        | |                                                         |
        | 00:SCAN HDFS [functional.alltypes a]                      |
        |    partitions=24/24 files=24 size=478.45KB                |
        |    runtime filters: RF000 -> a.int_col + a.bigint_col     |
        +-----------------------------------------------------------+
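
The stack trace places the failed check in Coordinator::UpdateFilterRoutingTable, and the plan's only join is a nested loop join while RF000 is still assigned to the scan. The following is a minimal sketch, not Impala's actual code, of a routing-table builder that accepts filters only on hash join nodes (as producers) and HDFS scan nodes (as consumers). The `PlanNode` type, the `kind` strings, and `build_filter_routing_table` are hypothetical names introduced for illustration, and the assumption that the nested loop join node itself carried RF000 is inferred from the DCHECK message rather than shown in the explain output.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlanNode:
    # Hypothetical stand-in for a serialized plan node; not Impala's real type.
    node_id: int
    kind: str  # e.g. "HASH_JOIN", "NESTED_LOOP_JOIN", "HDFS_SCAN"
    runtime_filters: List[str] = field(default_factory=list)


def build_filter_routing_table(nodes: List[PlanNode]) -> Dict[str, Dict[str, List[int]]]:
    """Map each filter ID to the node IDs that produce and consume it."""
    table: Dict[str, Dict[str, List[int]]] = {}
    for node in nodes:
        for filter_id in node.runtime_filters:
            entry = table.setdefault(filter_id, {"producers": [], "consumers": []})
            if node.kind == "HASH_JOIN":
                entry["producers"].append(node.node_id)
            elif node.kind == "HDFS_SCAN":
                entry["consumers"].append(node.node_id)
            else:
                # Assumed analogue of the failed check reported in this issue.
                raise AssertionError(
                    f"Unexpected plan node with runtime filters: "
                    f"{node.node_id}:{node.kind}"
                )
    return table
```

Under these assumptions, a plan shaped like the repro (node 02 as a nested loop join carrying RF000, node 00 as the scan consuming it) trips the assertion, while the same plan with a hash join at node 02 yields one producer and one consumer for RF000.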
        
        alex.behm Alexander Behm added a comment -

        commit 263f222557d5931374c2ea453d4c3d9ad1eafa70
        Author: Alex Behm <alex.behm@cloudera.com>
        Date: Wed Nov 16 16:44:52 2016 -0800

        IMPALA-4490: Only generate runtime filters for hash join nodes.

        Change-Id: I167725e260bd0f91c2bfc164eb044321192d5b95
        Reviewed-on: http://gerrit.cloudera.org:8080/5117
        Reviewed-by: Alex Behm <alex.behm@cloudera.com>
        Tested-by: Internal Jenkins
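
The commit subject says runtime filters are now generated only for hash join nodes. A rough sketch of what such a planner-side guard could look like follows. The real change lives in Impala's planner and may be structured quite differently; `JoinNode`, its fields, and `generate_runtime_filters` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class JoinNode:
    # Hypothetical planner-side join node; not Impala's real class.
    node_id: int
    kind: str  # "HASH_JOIN" or "NESTED_LOOP_JOIN"
    # (probe-side expression, build-side expression) pairs taken from "=" predicates
    equality_predicates: List[Tuple[str, str]]


def generate_runtime_filters(join: JoinNode) -> List[str]:
    """Return filter descriptions, or nothing if the join is not a hash join."""
    if join.kind != "HASH_JOIN":
        # Assumed guard: other join types never produce runtime filters.
        return []
    return [
        f"RF{i:03d} -> {probe}"
        for i, (probe, _build) in enumerate(join.equality_predicates)
    ]
```

With this guard, the equality predicate from the repro (`a.int_col + a.bigint_col = b.int_col`) yields no filter when it is attached to a nested loop join, so no RF000 would reach the scan node or the coordinator.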


          People

          • Assignee: Alexander Behm (alex.behm)
          • Reporter: Michael Brown (mikesbrown)