IMPALA-6364

Lock contention in FileHandleCache results in >2x slowdown for remote HDFS reads

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.10.0, Impala 2.11.0
    • Fix Version/s: Impala 2.12.0
    • Component/s: None
    • Labels:
      None

      Description

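      IMPALA-4623 introduced a locking scheme for the file handle cache that splits the cache into 16 partitions (buckets), each guarded by its own lock. With only 16 partitions, IO threads contend on these locks, which limits system throughput.

      Most IO threads end up in one of these two stacks, spinning on a partition lock in either GetFileHandle() or ReleaseFileHandle():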
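
      As a rough illustration of why the partition count matters, the sketch below estimates how likely it is that a thread picks a partition that at least one other thread also wants. It assumes each thread chooses a partition uniformly and independently, which is a simplification (threads reading the same file always map to the same partition). The function name ContentionProbability and the thread count in the usage note are hypothetical and not taken from Impala.

      ```cpp
      #include <cmath>

      // Probability that the partition chosen by one thread is also chosen by at
      // least one of the other (threads - 1) threads, under the ASSUMPTION that
      // every thread picks one of `partitions` partitions uniformly at random.
      // This is a back-of-the-envelope model, not a measurement of Impala.
      double ContentionProbability(int threads, int partitions) {
        if (threads <= 1) return 0.0;     // a single thread never contends
        if (partitions <= 1) return 1.0;  // one lock is shared by everyone
        // P(no other thread picks my partition) = (1 - 1/partitions)^(threads - 1)
        return 1.0 - std::pow(1.0 - 1.0 / partitions, threads - 1);
      }
      ```

      Under this model, comparing ContentionProbability(n, 16) with ContentionProbability(n, 256) for a given number of IO threads n shows the overlap probability dropping as the partition count grows.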

      #0  0x0000000002085d47 in base::internal::SpinLockDelay(int volatile*, int, int) ()
      #1  0x0000000002085c29 in base::SpinLock::SlowLock() ()
      #2  0x00000000010fa76d in impala::io::FileHandleCache<16ul>::GetFileHandle(hdfs_internal* const&, std::string*, long, bool, bool*) ()
      #3  0x00000000010f6e22 in impala::io::DiskIoMgr::GetCachedHdfsFileHandle(hdfs_internal* const&, std::string*, long, impala::io::RequestContext*, bool) ()
      #4  0x00000000010fd514 in impala::io::ScanRange::Open(bool) ()
      #5  0x00000000010f691f in impala::io::DiskIoMgr::ReadRange(impala::io::DiskIoMgr::DiskQueue*, impala::io::RequestContext*, impala::io::ScanRange*) ()
      #6  0x00000000010f6dc4 in impala::io::DiskIoMgr::WorkLoop(impala::io::DiskIoMgr::DiskQueue*) ()
      #7  0x0000000000d13333 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) ()
      #8  0x0000000000d13a74 in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > > >::run() ()
      #9  0x000000000128ea3a in thread_proxy ()
      #10 0x00007f49f2bbadc5 in start_thread () from /lib64/libpthread.so.0
      #11 0x00007f49f28e976d in clone () from /lib64/libc.so.6
      
      #0  0x0000000002085d47 in base::internal::SpinLockDelay(int volatile*, int, int) ()
      #1  0x0000000002085c29 in base::SpinLock::SlowLock() ()
      #2  0x00000000010f9929 in impala::io::FileHandleCache<16ul>::ReleaseFileHandle(std::string*, impala::io::HdfsFileHandle*, bool) ()
      #3  0x00000000010fe69e in impala::io::ScanRange::Close() ()
      #4  0x00000000010f6565 in impala::io::DiskIoMgr::HandleReadFinished(impala::io::DiskIoMgr::DiskQueue*, impala::io::RequestContext*, std::unique_ptr<impala::io::BufferDescriptor, std::default_delete<impala::io::BufferDescriptor> >) ()
      #5  0x00000000010f695b in impala::io::DiskIoMgr::ReadRange(impala::io::DiskIoMgr::DiskQueue*, impala::io::RequestContext*, impala::io::ScanRange*) ()
      #6  0x00000000010f6dc4 in impala::io::DiskIoMgr::WorkLoop(impala::io::DiskIoMgr::DiskQueue*) ()
      #7  0x0000000000d13333 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) ()
      #8  0x0000000000d13a74 in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > > >::run() ()
      #9  0x000000000128ea3a in thread_proxy ()
      #10 0x00007f49f2bbadc5 in start_thread () from /lib64/libpthread.so.0
      #11 0x00007f49f28e976d in clone () from /lib64/libc.so.6
      

      Increasing the number of partitions to 256 made the contention go away. A simple fix would be to make the number of partitions a startup flag and change its default to 256.
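
      The proposed change could look roughly like the sketch below: a cache whose partition count is a constructor argument (for example, fed from a startup flag) rather than a template parameter such as FileHandleCache<16ul>. This is a simplified illustration, not Impala's implementation. The names PartitionedHandleCache and Handle are hypothetical, std::mutex stands in for the SpinLock seen in the stacks, and handles are shared rather than checked out exclusively.

      ```cpp
      #include <cstddef>
      #include <functional>
      #include <memory>
      #include <mutex>
      #include <string>
      #include <unordered_map>
      #include <vector>

      // Hypothetical stand-in for a cached file handle (not impala::io::HdfsFileHandle).
      struct Handle {
        std::string path;
      };

      class PartitionedHandleCache {
       public:
        // `num_partitions` would come from a startup flag in the proposal.
        // A value of 0 is clamped to 1 so the modulo below is always defined.
        explicit PartitionedHandleCache(std::size_t num_partitions)
            : partitions_(num_partitions == 0 ? 1 : num_partitions) {}

        std::size_t num_partitions() const { return partitions_.size(); }

        // Maps a file name to a partition; more partitions means fewer files
        // (and therefore fewer threads) share any one lock.
        std::size_t PartitionIndex(const std::string& path) const {
          return std::hash<std::string>{}(path) % partitions_.size();
        }

        // Returns the cached handle for `path`, creating it on a miss.
        // Only the lock of the owning partition is taken.
        std::shared_ptr<Handle> Get(const std::string& path) {
          Partition& p = partitions_[PartitionIndex(path)];
          std::lock_guard<std::mutex> guard(p.lock);
          auto it = p.handles.find(path);
          if (it != p.handles.end()) return it->second;
          auto handle = std::make_shared<Handle>(Handle{path});
          p.handles.emplace(path, handle);
          return handle;
        }

       private:
        struct Partition {
          std::mutex lock;
          std::unordered_map<std::string, std::shared_ptr<Handle>> handles;
        };

        // Sized once at construction; never resized, so Partition need not be movable.
        std::vector<Partition> partitions_;
      };
      ```

      Constructing the cache as PartitionedHandleCache cache(256) instead of PartitionedHandleCache cache(16) changes only how many locks the same set of files is spread across; callers of Get() are unaffected.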

        Attachments

        1. d2402_cdh5.12_profile.txt
          73 kB
          Mostafa Mokhtar
        2. d2402_cdh5.13_profile.txt
          75 kB
          Mostafa Mokhtar
        3. remote_hdfs_scan_pstack.txt
          2.57 MB
          Mostafa Mokhtar

            People

            • Assignee: Joe McDonnell (joemcdonnell)
            • Reporter: Mostafa Mokhtar (mmokhtar)