  1. IMPALA
  2. IMPALA-10725 Support JSON format tables
  3. IMPALA-11145

Block reads on JSON table until we support it

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Component/s: Frontend

    Description

      Until IMPALA-10798 is finished, the FE should not generate any plan that scans a JSON table; otherwise the BE crashes. The crash is easy to reproduce:

      Create a JSON table in Impala:

      create table my_json_tbl (id int, name string, age int) stored as jsonfile;
      

      Upload a JSON file into its directory:

      $ cat id_name_age.json 
      {"id": 0, "name": "Alice", "age": 10}
      {"id": 1, "name": "Bob", "age": 20}
      {"id": 2, "name": "Oracle", "age": 16}
      
      $ hdfs dfs -put id_name_age.json hdfs://localhost:20500/test-warehouse/my_json_tbl
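The data file from the step above can also be generated rather than typed by hand. The following is a minimal sketch, assuming one JSON object per line is the intended layout; the names `RECORDS` and `to_json_lines` are hypothetical and only used for illustration.

```python
import json

# Records taken from the reproduction steps in this issue.
RECORDS = [
    {"id": 0, "name": "Alice", "age": 10},
    {"id": 1, "name": "Bob", "age": 20},
    {"id": 2, "name": "Oracle", "age": 16},
]


def to_json_lines(records):
    """Serialize each record as one newline-terminated JSON object."""
    return "".join(json.dumps(record) + "\n" for record in records)


if __name__ == "__main__":
    import sys

    # Write to stdout so the caller decides where the file goes.
    sys.stdout.write(to_json_lines(RECORDS))
```

Redirecting the script's output to id_name_age.json should give a 113-byte file, which is the size that `show files` reports for the uploaded file.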
      

      Query the table in Impala:

      [localhost:21050] default> refresh my_json_tbl;
      [localhost:21050] default> show files in my_json_tbl;
      +--------------------------------------------------------------------+------+-----------+
      | Path                                                               | Size | Partition |
      +--------------------------------------------------------------------+------+-----------+
      | hdfs://localhost:20500/test-warehouse/my_json_tbl/id_name_age.json | 113B |           |
      +--------------------------------------------------------------------+------+-----------+
      [localhost:21050] default> select * from my_json_tbl;
      Query: select * from my_json_tbl
      Query submitted at: 2022-02-22 13:50:04 (Coordinator: http://quanlong-OptiPlex-BJ:25000)
      Query progress can be monitored at: http://quanlong-OptiPlex-BJ:25000/query_plan?query_id=7c427bd4bc29881a:702a2a9b00000000
      [                                                                                                    ] 0%
      ERROR: Failed due to unreachable impalad(s): quanlong-OptiPlex-BJ:27002
      

      The impalad on port 27002 crashed. Its ERROR log file, logs/cluster/impalad_node2.ERROR, shows:

      F0222 13:50:05.099845  4112 hdfs-scan-node-base.cc:693] 7c427bd4bc29881a:702a2a9b00000001] Check failed: false Unexpected file type JSON
      *** Check failure stack trace: *** 
          @          0x570012c  google::LogMessage::Fail()
          @          0x57019dc  google::LogMessage::SendToLog()
          @          0x56ffa8a  google::LogMessage::Flush()
          @          0x5703648  google::LogMessageFatal::~LogMessageFatal()
          @          0x2b6dadc  impala::HdfsScanNodeBase::IssueInitialScanRanges()
          @          0x2d478bb  impala::HdfsScanNode::GetNext()
          @          0x24ee836  impala::FragmentInstanceState::ExecInternal()
          @          0x24eaa9b  impala::FragmentInstanceState::Exec()
          @          0x24258c1  impala::QueryState::ExecFInstance()
          @          0x2423cbf  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
          @          0x242894b  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
          @          0x22c5769  boost::function0<>::operator()()
          @          0x2a99f3a  impala::Thread::SuperviseThread()
          @          0x2aa288a  boost::_bi::list5<>::operator()<>()
          @          0x2aa27ae  boost::_bi::bind_t<>::operator()()
          @          0x2aa276f  boost::detail::thread_data<>::run()
          @          0x43a8d10  thread_proxy
          @     0x7fd9616c36b9  start_thread
          @     0x7fd95e0ff4dc  clone
      

      FE should reject such SELECT queries and return an error such as "Reading JSON files is not supported yet".
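The requested behaviour amounts to an allow-list check at analysis time, before any scan plan reaches the backend. The sketch below illustrates the idea in Python only. It is not Impala's frontend code, and every name in it (`FileFormat`, `SCANNABLE_FORMATS`, `AnalysisError`, `check_scannable`) as well as the set of formats listed is an assumption made for the example.

```python
from enum import Enum


class FileFormat(Enum):
    """Hypothetical subset of table file formats."""
    TEXT = "TEXT"
    PARQUET = "PARQUET"
    ORC = "ORC"
    JSON = "JSON"


# Formats the (hypothetical) backend scanner can read. JSON is left out
# on purpose until a JSON scanner exists.
SCANNABLE_FORMATS = frozenset(
    {FileFormat.TEXT, FileFormat.PARQUET, FileFormat.ORC}
)


class AnalysisError(Exception):
    """Raised when a query is rejected before planning."""


def check_scannable(table_name, file_format):
    """Reject a scan of a table whose file format has no scanner."""
    if file_format not in SCANNABLE_FORMATS:
        raise AnalysisError(
            f"Scanning {file_format.value} tables is not supported yet: "
            f"{table_name}"
        )
```

With a check of this kind, `check_scannable("my_json_tbl", FileFormat.JSON)` raises an error that can be returned to the client, instead of a plan being sent to an impalad that then fails a fatal check on the unexpected file type.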

          People

            pranav.lodha Pranav Yogi Lodha
            stigahuang Quanlong Huang