Details
Type: Sub-task
Status: Resolved
Priority: Critical
Resolution: Fixed
Description
Until IMPALA-10798 is finished, the FE should not generate any plan that scans a JSON table. Otherwise, the BE crashes. The crash is easy to reproduce:
Create a JSON table in Impala:
create table my_json_tbl (id int, name string, age int) stored as jsonfile;
Upload a JSON file into its directory:
$ cat id_name_age.json
{"id": 0, "name": "Alice", "age": 10}
{"id": 1, "name": "Bob", "age": 20}
{"id": 2, "name": "Oracle", "age": 16}
$ hdfs dfs -put id_name_age.json hdfs://localhost:20500/test-warehouse/my_json_tbl
Query the table in Impala:
[localhost:21050] default> refresh my_json_tbl;
[localhost:21050] default> show files in my_json_tbl;
+--------------------------------------------------------------------+------+-----------+
| Path                                                               | Size | Partition |
+--------------------------------------------------------------------+------+-----------+
| hdfs://localhost:20500/test-warehouse/my_json_tbl/id_name_age.json | 113B |           |
+--------------------------------------------------------------------+------+-----------+
[localhost:21050] default> select * from my_json_tbl;
Query: select * from my_json_tbl
Query submitted at: 2022-02-22 13:50:04 (Coordinator: http://quanlong-OptiPlex-BJ:25000)
Query progress can be monitored at: http://quanlong-OptiPlex-BJ:25000/query_plan?query_id=7c427bd4bc29881a:702a2a9b00000000
[ ] 0%
ERROR: Failed due to unreachable impalad(s): quanlong-OptiPlex-BJ:27002
The impalad at port 27002 crashed. Its ERROR log file, logs/cluster/impalad_node2.ERROR, shows:
F0222 13:50:05.099845 4112 hdfs-scan-node-base.cc:693] 7c427bd4bc29881a:702a2a9b00000001] Check failed: false Unexpected file type JSON
*** Check failure stack trace: ***
    @ 0x570012c google::LogMessage::Fail()
    @ 0x57019dc google::LogMessage::SendToLog()
    @ 0x56ffa8a google::LogMessage::Flush()
    @ 0x5703648 google::LogMessageFatal::~LogMessageFatal()
    @ 0x2b6dadc impala::HdfsScanNodeBase::IssueInitialScanRanges()
    @ 0x2d478bb impala::HdfsScanNode::GetNext()
    @ 0x24ee836 impala::FragmentInstanceState::ExecInternal()
    @ 0x24eaa9b impala::FragmentInstanceState::Exec()
    @ 0x24258c1 impala::QueryState::ExecFInstance()
    @ 0x2423cbf _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
    @ 0x242894b _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
    @ 0x22c5769 boost::function0<>::operator()()
    @ 0x2a99f3a impala::Thread::SuperviseThread()
    @ 0x2aa288a boost::_bi::list5<>::operator()<>()
    @ 0x2aa27ae boost::_bi::bind_t<>::operator()()
    @ 0x2aa276f boost::detail::thread_data<>::run()
    @ 0x43a8d10 thread_proxy
    @ 0x7fd9616c36b9 start_thread
    @ 0x7fd95e0ff4dc clone
The FE should reject such SELECT queries during analysis and return an error like "Reading JSON files is unsupported yet".
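The proposed behavior could be sketched roughly as follows. This is only an illustration of rejecting the scan at analysis time, not the actual Impala patch: the class JsonScanGuard, the FileFormat enum, the AnalysisException stand-in, and the exact message wording are all hypothetical names introduced here and are not taken from the Impala code base.

```java
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical, self-contained sketch: none of these types are Impala's real classes.
class JsonScanGuard {
    // Stand-in for the table file formats the frontend knows about.
    enum FileFormat { TEXT, PARQUET, ORC, AVRO, JSON }

    // Stand-in for an analysis-time error reported back to the client.
    static class AnalysisException extends Exception {
        AnalysisException(String msg) { super(msg); }
    }

    // Formats that can be created as tables but have no backend scanner yet.
    private static final Set<FileFormat> UNSCANNABLE = EnumSet.of(FileFormat.JSON);

    // Returns an error message if the table must not be scanned, else empty.
    static Optional<String> scanError(String tableName, FileFormat format) {
        if (UNSCANNABLE.contains(format)) {
            return Optional.of("Scanning " + format + " table " + tableName
                + " is not supported yet");
        }
        return Optional.empty();
    }

    // Called before any scan plan is generated, so the backend never
    // receives a scan range for a format it cannot handle.
    static void checkScannable(String tableName, FileFormat format)
            throws AnalysisException {
        Optional<String> error = scanError(tableName, format);
        if (error.isPresent()) throw new AnalysisException(error.get());
    }

    public static void main(String[] args) {
        try {
            checkScannable("default.my_json_tbl", FileFormat.JSON);
            System.out.println("plan generated");
        } catch (AnalysisException e) {
            // The query fails cleanly instead of crashing an executor.
            System.out.println("ERROR: " + e.getMessage());
        }
    }
}
```

With a check of this kind, the SELECT above would fail with an ordinary query error on the coordinator, and no fragment would reach the fatal check in hdfs-scan-node-base.cc.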