IMPALA-12295

Statestore crashed when restarting catalogd


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 4.3.0
    • Component/s: Backend
    • Labels: None

    Description

      I restarted catalogd in my dev environment and the statestore crashed.

      $ bin/start-impala-cluster.py --restart_catalogd_only 
      19:39:44 MainThread: Found 3 impalad/1 statestored/0 catalogd process(es)
      19:39:44 MainThread: Starting Catalog Service logging to /home/quanlong/workspace/Impala/logs/cluster/catalogd.INFO
      19:39:47 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:48 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:49 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:50 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:51 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:53 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:54 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:55 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:56 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:57 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:58 MainThread: Found 3 impalad/0 statestored/1 catalogd process(es)
      19:39:58 MainThread: Error starting cluster
      Traceback (most recent call last):
        File "bin/start-impala-cluster.py", line 930, in <module>
          expected_cluster_size - expected_catalog_delays)
        File "/home/quanlong/workspace/Impala/tests/common/impala_cluster.py", line 185, in wait_until_ready
          self.wait_for_num_impalads(expected_num_impalads)
        File "/home/quanlong/workspace/Impala/tests/common/impala_cluster.py", line 231, in wait_for_num_impalads
          raise RuntimeError(msg)
      RuntimeError: statestored failed to start. 

      statestored.ERROR shows the following DCHECK failure:

      F0718 19:39:47.096524 11460 statestore-catalogd-mgr.cc:58] Check failed: num_registered_catalogd_ < 2 
      *** Check failure stack trace: *** 
          @          0x38371ed  google::LogMessage::Fail()
          @          0x3839124  google::LogMessage::SendToLog()
          @          0x3836bcc  google::LogMessage::Flush()
          @          0x3839649  google::LogMessageFatal::~LogMessageFatal()
          @          0x17c3540  impala::StatestoreCatalogdMgr::RegisterCatalogd()
          @          0x17a32fe  impala::Statestore::RegisterSubscriber()
          @          0x17c27ed  StatestoreThriftIf::RegisterSubscriber()
          @          0x17c0a92  impala::StatestoreServiceProcessorT<>::process_RegisterSubscriber()
          @          0x17c30b3  impala::StatestoreServiceProcessorT<>::dispatchCall()
          @           0xee68df  apache::thrift::TDispatchProcessor::process()
          @          0x131e224  apache::thrift::server::TAcceptQueueServer::Task::run()
          @          0x130ab89  impala::ThriftThread::RunRunnable()
          @          0x130c7b1  boost::detail::function::void_function_obj_invoker0<>::invoke()
          @          0x1930e30  impala::Thread::SuperviseThread()
          @          0x1931c39  boost::detail::thread_data<>::run()
          @          0x2359067  thread_proxy
          @     0x7f5cbef5a6db  start_thread
          @     0x7f5cbbcc061f  clone

      The cluster runs without catalogd HA.
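
When triaging a crash like this one, it can help to pull the failing source location and check message out of statestored.ERROR programmatically. The sketch below is a hypothetical helper, not part of Impala or its test tooling. It assumes the glog prefix layout visible in the fatal line above (severity letter, MMDD date, time with microseconds, thread id, then file:line]); the name `parse_fatal` and the regular expression are illustrative choices.

```python
import re
from typing import Optional

# Assumed glog prefix layout, taken from the fatal line in this report:
#   F<MMDD> <HH:MM:SS.uuuuuu> <thread id> <file>:<line>] <message>
GLOG_FATAL = re.compile(
    r"^F(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<tid>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_fatal(line: str) -> Optional[dict]:
    """Return the fields of a glog FATAL line, or None for any other line."""
    match = GLOG_FATAL.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])  # source line number as an int
    return record


# The fatal line quoted in the description, used as sample input.
sample = (
    "F0718 19:39:47.096524 11460 statestore-catalogd-mgr.cc:58] "
    "Check failed: num_registered_catalogd_ < 2 "
)
rec = parse_fatal(sample)
print(rec["file"], rec["line"], rec["msg"])
```

Applied to each line of the log, the helper skips the stack frames and the cluster script output, because none of them start with the `F` severity prefix.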


          People

            Assignee: Wenzhe Zhou (wzhou)
            Reporter: Quanlong Huang (stigahuang)