IMPALA-6285

Avoid printing the stack trace in DoTransmitDataRpc, as it burns a lot of kernel CPU

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.11.0
    • Fix Version/s: Impala 2.11.0
    • Component/s: None

      Description

      When running 32 concurrent TPC-DS queries against a cluster of 20 r4.8xlarge instances, some of the RPCs time out but do not fail the query. Each timeout is logged with a full stack trace:

      I1206 12:44:14.925405 25274 status.cc:58] RPC recv timed out: Client foo-17.domain.com:22000 timed-out during recv call.
          @           0x957a6a  impala::Status::Status()
          @          0x11dd5fe  impala::DataStreamSender::Channel::DoTransmitDataRpc()
          @          0x11ddcd4  impala::DataStreamSender::Channel::TransmitDataHelper()
          @          0x11de080  impala::DataStreamSender::Channel::TransmitData()
          @          0x11e1004  impala::ThreadPool<>::WorkerThread()
          @           0xd10063  impala::Thread::SuperviseThread()
          @           0xd107a4  boost::detail::thread_data<>::run()
          @          0x128997a  (unknown)
          @     0x7f68c5bc7e25  start_thread
          @     0x7f68c58f534d  __clone
      
      I1206 12:44:15.152775 25296 status.cc:58] RPC recv timed out: Client foo-5.domain.com:22000 timed-out during recv call.
          @           0x957a6a  impala::Status::Status()
          @          0x11dd5fe  impala::DataStreamSender::Channel::DoTransmitDataRpc()
          @          0x11ddcd4  impala::DataStreamSender::Channel::TransmitDataHelper()
          @          0x11de080  impala::DataStreamSender::Channel::TransmitData()
          @          0x11e1004  impala::ThreadPool<>::WorkerThread()
          @           0xd10063  impala::Thread::SuperviseThread()
          @           0xd107a4  boost::detail::thread_data<>::run()
          @          0x128997a  (unknown)
          @     0x7f68c5bc7e25  start_thread
          @     0x7f68c58f534d  __clone
      

      The status can be changed to an expected status so that no stack trace is logged, but it is worth verifying first that this timeout can be tolerated.
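
To illustrate the distinction between a status that logs a stack trace and an "expected" status that does not, here is a minimal, self-contained C++ sketch. It is not Impala's code: `SketchStatus`, `CaptureStackTrace`, and `g_stack_traces_captured` are hypothetical names invented for this example, and the stack-trace capture is a stand-in that only counts calls instead of walking and symbolizing real frames.

```cpp
#include <iostream>
#include <string>
#include <utility>

// Hypothetical counter so the effect of each constructor can be observed.
static int g_stack_traces_captured = 0;

// Hypothetical stand-in for an expensive stack-trace capture. A real
// implementation would unwind and symbolize frames; this one only counts.
std::string CaptureStackTrace() {
  ++g_stack_traces_captured;
  return "    @  <stack frames would be symbolized here>\n";
}

// Hypothetical status type, loosely modelled on the behaviour described
// in this issue. It is an assumption, not Impala's actual Status class.
class SketchStatus {
 public:
  // Unexpected error: logs the message together with a stack trace.
  explicit SketchStatus(std::string msg) : msg_(std::move(msg)) {
    std::clog << msg_ << "\n" << CaptureStackTrace();
  }

  // Expected error: carries the message but skips logging and the
  // stack-trace capture entirely.
  static SketchStatus Expected(std::string msg) {
    return SketchStatus(std::move(msg), NoLogTag{});
  }

  const std::string& msg() const { return msg_; }

 private:
  struct NoLogTag {};
  SketchStatus(std::string msg, NoLogTag) : msg_(std::move(msg)) {}

  std::string msg_;
};
```

In this sketch, a frequently hit and tolerable error path such as an RPC receive timeout would construct its status through `SketchStatus::Expected(...)`, so the per-error cost is reduced to building the message string.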

              People

              • Assignee: Michael Ho (kwho)
              • Reporter: David Rorke (drorke)