  1. Apache HAWQ (Retired)
  2. HAWQ-839

libyarn core dump on failover to standby RM

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0.0-incubating
    • Component/s: libyarn
    • Labels: None

    Description

      Start HAWQ in YARN mode, then kill the Hadoop YARN ResourceManager. HAWQ core dumps with the following stack:
      #0 0x0000003e054325e5 in raise () from /lib64/libc.so.6
      #1 0x0000003e05433dc5 in abort () from /lib64/libc.so.6
      #2 0x00007f04980b1109 in libyarn::HandleYarnFailoverException (e=...)
      at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/ApplicationClient.cpp:170
      #3 0x00007f04980b3211 in libyarn::ApplicationClient::getNewApplication (this=0x1f17cd0)
      at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/ApplicationClient.cpp:215
      #4 0x00007f049809d639 in libyarn::LibYarnClient::createJob (this=0x1f1e500, jobName="hawq", queue="default",
      jobId="")
      at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/LibYarnClient.cpp:163
      #5 0x00007f04980987b8 in createJob (client=0x1f25950, jobName=Unhandled dwarf expression opcode 0xf3
      )
      at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/LibYarnClientC.cpp:61
      #6 createJob (client=0x1f25950, jobName=Unhandled dwarf expression opcode 0xf3
      )
      at /home/gpadmin/workspace/hawq/incubator-hawq/depends/libyarn/src/libyarnclient/LibYarnClientC.cpp:180
      #7 0x00000000008e1117 in RB2YARN_registerYARNApplication ()
      #8 0x00000000008e31ad in RB2YARN_initializeConnection ()
      #9 0x00000000008e358b in ResBrokerMainInternal ()
      #10 0x00000000008e38e8 in ResBrokerMain ()
      #11 0x00000000008dfb66 in RB_LIBYARN_start ()
      #12 0x000000000090ae5e in MainHandlerLoop ()
      #13 0x000000000090b46a in ResManagerMainServer2ndPhase ()
      #14 0x000000000090ba14 in ResManagerMain ()
      #15 0x000000000090bd71 in ResManagerProcessStartup ()
      #16 0x0000000000767f98 in CommenceNormalOperations ()
      #17 0x0000000000768d44 in do_reaper ()
      #18 0x000000000076dbed in ServerLoop ()
      #19 0x000000000076f73e in PostmasterMain ()
      #20 0x00000000006c828a in main ()
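
      In the trace above, abort() is reached from libyarn::HandleYarnFailoverException while getNewApplication is running, so losing the RM takes down the whole process. The actual fix is not shown in this issue. Purely as an illustration, a failover wrapper that moves on to the next RM and reports total failure with an exception the caller can handle might look like the sketch below. The names RmUnavailable and callWithFailover are hypothetical and are not part of libyarn's API.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type standing in for "this RM cannot be reached".
struct RmUnavailable : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Tries `call` against each ResourceManager address in order.
// Returns the index of the first RM that answered.
// Throws RmUnavailable if every candidate failed, so the caller
// decides what to do instead of the process being aborted.
template <typename Fn>
std::size_t callWithFailover(const std::vector<std::string>& rms, Fn call) {
    for (std::size_t i = 0; i < rms.size(); ++i) {
        try {
            call(rms[i]);
            return i;
        } catch (const RmUnavailable&) {
            // This RM is down; fall through to the next candidate.
        }
    }
    throw RmUnavailable("no ResourceManager reachable");
}
```

      A caller would pass the configured RM addresses and a callable that performs the request, and catch RmUnavailable to log the failure and retry later. Exceptions of any other type propagate unchanged in this sketch.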

            People

              Assignee: Wen Lin (wlin)
              Reporter: Wen Lin (wlin)