IMPALA-11177

crash in useAsyncIoForStream due to unknown orc::StreamKind

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 4.1.0
    • Fix Version/s: Impala 4.1.0
    • Component/s: Backend
    • Labels: None

    Description

      Hit a DCHECK in useAsyncIoForStream() in an unrelated build: https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/5391

      Stacktrace:

      F0310 11:55:24.087487 15832 hdfs-orc-scanner.cc:183] 0b4eaeb37f8e5d6b:84d11bab00000003] Check failed: false 
      *** Check failure stack trace: ***
          @          0x574005c  google::LogMessage::Fail()
          @          0x574190c  google::LogMessage::SendToLog()
          @          0x573f9ba  google::LogMessage::Flush()
          @          0x5743578  google::LogMessageFatal::~LogMessageFatal()
          @          0x2c0d5e3  impala::useAsyncIoForStream()
          @          0x2c0d77f  impala::HdfsOrcScanner::StartColumnReading()
          @          0x2c15c36  impala::HdfsOrcScanner::NextStripe()
          @          0x2c1503f  impala::HdfsOrcScanner::GetNextInternal()
          @          0x2c13e12  impala::HdfsOrcScanner::ProcessSplit()
          @          0x2d7266a  impala::HdfsScanNode::ProcessSplit()
          @          0x2d719ec  impala::HdfsScanNode::ScannerThread()
          @          0x2d70d49  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
          @          0x2d73975  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
          @          0x23df6b3  boost::function0<>::operator()()
          @          0x2abe062  impala::Thread::SuperviseThread()
          @          0x2ac6a9a  boost::_bi::list5<>::operator()<>()
          @          0x2ac69be  boost::_bi::bind_t<>::operator()()
          @          0x2ac697f  boost::detail::thread_data<>::run()
          @          0x43e9dd0  thread_proxy
          @     0x7f15bf27e6b9  start_thread
          @     0x7f15bbd8451c  clone 
       

      The query is:

      I0310 11:55:23.958173 29215 Frontend.java:1636] 0b4eaeb37f8e5d6b:84d11bab00000000] Analyzing query: select count(*) from (select distinct * from test_fuzz_alltypes_b98dffcf.alltypes) q db: functional_orc_def
      

      It comes from test_scanners_fuzz.py::TestScannersFuzzing::test_fuzz_alltypes:

        74: client_identifier (string) = "query_test/test_scanners_fuzz.py::TestScannersFuzzing::()::test_fuzz_alltypes[protocol:beeswax|exec_option:{'debug_action':'-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5';'abort_on_error':False;'mem_limit':'512m';'num_nodes':0}|table_format:orc/def/block]",
      

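      The underlying ORC files are malformed, so the scanner can see an orc::StreamKind value it does not recognize. I think useAsyncIoForStream() should return false in such cases instead of hitting the DCHECK, and let the ORC library return errors later.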
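
      A minimal sketch of that idea, not the actual Impala code. The enum below is a hypothetical stand-in for orc::StreamKind (its values mirror the stream kinds in the ORC file format), and which kinds return true is an assumption made only for illustration. The point is the default branch: an unrecognized kind falls through to false rather than a fatal check.

```cpp
// Hypothetical stand-in for orc::StreamKind so the sketch compiles on its own.
// A fixed underlying type makes any int value (e.g. from a corrupt file) valid.
enum class StreamKindSketch : int {
  PRESENT = 0,
  DATA = 1,
  LENGTH = 2,
  DICTIONARY_DATA = 3,
  DICTIONARY_COUNT = 4,
  SECONDARY = 5,
  ROW_INDEX = 6,
  BLOOM_FILTER = 7,
  BLOOM_FILTER_UTF8 = 8,
};

// Hypothetical helper: decides whether a stream should be read asynchronously.
// Which kinds qualify is an assumption for this sketch.
bool UseAsyncIoForStreamSketch(StreamKindSketch kind) {
  switch (kind) {
    case StreamKindSketch::PRESENT:
    case StreamKindSketch::DATA:
    case StreamKindSketch::LENGTH:
    case StreamKindSketch::DICTIONARY_DATA:
    case StreamKindSketch::SECONDARY:
      return true;
    case StreamKindSketch::DICTIONARY_COUNT:
    case StreamKindSketch::ROW_INDEX:
    case StreamKindSketch::BLOOM_FILTER:
    case StreamKindSketch::BLOOM_FILTER_UTF8:
      return false;
    default:
      // Unknown kind, most likely from a malformed file. Do not crash here;
      // skip async IO and leave error reporting to the reader that parses
      // the stream later.
      return false;
  }
}
```

      With this shape, a fuzzed file that carries an out-of-range stream kind only loses the async read for that stream, and the scan fails (or not) through the normal error path.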

            People

              Assignee: Riza Suminto (rizaon)
              Reporter: Quanlong Huang (stigahuang)
              Votes: 0
              Watchers: 3